<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="../../assets/xml/rss.xsl" media="all"?><rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Mariano Guerra's Log (Publicaciones sobre PapersOfTheWeek)</title><link>http://marianoguerra.org/</link><description></description><atom:link href="http://marianoguerra.org/es/categories/papersoftheweek.xml" rel="self" type="application/rss+xml"></atom:link><language>es</language><lastBuildDate>Mon, 18 Nov 2024 17:56:44 GMT</lastBuildDate><generator>Nikola (getnikola.com)</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>Papers (and other things) of the LargeSpanOfTime II</title><link>http://marianoguerra.org/es/posts/papers-and-other-things-of-the-largespanoftime-ii/</link><dc:creator>Mariano Guerra</dc:creator><description>&lt;p&gt;OK, the title is getting fuzzier and fuzzier, but I decided to condense some
things I've been reading here.&lt;/p&gt;
&lt;p&gt;Papers:&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="https://people.mpi-sws.org/~rossberg/papers/Haas,%20Rossberg,%20Schuff,%20Titzer,%20Gohman,%20Wagner,%20Zakai,%20Bastien,%20Holman%20-%20Bringing%20the%20Web%20up%20to%20Speed%20with%20WebAssembly.pdf"&gt;Bringing the Web up to Speed with WebAssembly&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;I like compilers and their implementations, so I've been following WebAssembly;
this paper is a good place to start.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/45855.pdf"&gt;Spanner, TrueTime &amp;amp; The CAP Theorem&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;A blog post by Google made the rounds lately, with people claiming that Google
said it had beaten the CAP Theorem, so I went to the source. The conclusion
is interesting:&lt;/p&gt;
&lt;pre class="literal-block"&gt;Spanner reasonably claims to be an “effectively CA” system despite operating over a wide area, as it is
always consistent and achieves greater than 5 9s availability. As with Chubby, this combination is possible
in practice if you control the whole network, which is rare over the wide area. Even then, it requires
significant redundancy of network paths, architectural planning to manage correlated failures, and very
careful operations, especially for upgrades. Even then outages will occur, in which case Spanner chooses
consistency over availability.
Spanner uses two-phase commit to achieve serializability, but it uses TrueTime for external consistency,
consistent reads without locking, and consistent snapshots.&lt;/pre&gt;
&lt;p&gt;&lt;a class="reference external" href="https://bitcoin.org/bitcoin.pdf"&gt;Bitcoin: A Peer-to-Peer Electronic Cash System&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;Again, many people are ranting and raving about Bitcoin, blockchain and cryptocurrencies;
what's better than going to the source? It's a really readable paper.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="https://www.computer.org/cms/Computer.org/ComputingNow/homepage/2012/0512/T_CO2_CAP12YearsLater.pdf"&gt;CAP Twelve Years Later: How the “Rules” Have Changed&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;I have a feeling of déjà vu, as if I had already read this paper, but just to be sure I read it
again. It's an interesting summary of the concepts and how they evolved over time.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://webpages.eng.wayne.edu/%7Efj9817/papers/lsm-trie.pdf"&gt;LSM-trie: An LSM-tree-based Ultra-Large Key-Value Store for Small Data&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;I wanted to read the LSM-tree paper, but it seems I didn't look at what I was clicking,
so I ended up reading the LSM-trie paper instead. It's really interesting
and has an overview of the LSM-tree one; now I have to go and read that one too.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://homepages.inf.ed.ac.uk/wadler/papers/prettier/prettier.pdf"&gt;A prettier printer Philip Wadler&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;In a &lt;a class="reference external" href="http://marianoguerra.org/posts/papers-of-the-week-iv/"&gt;previous post&lt;/a&gt;
I mentioned that I read "The Design of a Pretty-printing Library"
and that I was expecting something else. Well, this paper is that something else, and I liked it more.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://cseweb.ucsd.edu/~vahdat/papers/mop.pdf"&gt;Metaobject protocols: Why we want them and what else they can do&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;Being an aspiring &lt;a class="reference external" href="http://wiki.c2.com/?SmugLispWeenie"&gt;Smug Lisp Weenie&lt;/a&gt;
I had to read this one, it's a nice paper and puts a name on some "patterns" that
I've observed but couldn't describe clearly.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="https://pdfs.semanticscholar.org/75d0/343ee6a2c4cef9e7b8057f8088e5f82d165c.pdf?_ga=2.87388440.1430389071.1504186461-673991587.1504186461"&gt;The Cube Data Model: A Conceptual Model and Algebra for On-Line Analytical Processing in Data Warehouses&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;I've been thinking lately about the relation between Pivot Tables, Data Cubes and
the things mentioned in the paper &lt;a class="reference external" href="http://vita.had.co.nz/papers/layered-grammar.pdf"&gt;A Layered Grammar of Graphics&lt;/a&gt;
so I started reading more about Data Cubes. I skimmed a couple of papers that I
forgot to write down anywhere, but this is one I actually recorded.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://web.mit.edu/Saltzer/www/publications/endtoend/endtoend.pdf"&gt;End-to-End Arguments in System Design&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;Someone somewhere mentioned this paper so I went to look. It's a really good
one: like the Metaobject protocol paper and others I've read, it is
a condensation of years of knowledge and experience that is really
interesting to read.&lt;/p&gt;
&lt;p&gt;Books:&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="https://cs.oberlin.edu/~jwalker/refs/betaBook.pdf"&gt;Object-Oriented Programming in the Beta Programming Language&lt;/a&gt;:&lt;/p&gt;
&lt;p&gt;Interesting book about a really interesting (and different) object oriented
programming language by the creators of Simula (aka the creators of object
orientation), it explains an abstraction called "patterns" in which all other
abstractions are expressed.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://www.ethoberon.ethz.ch/WirthPubl/ProjectOberon.pdf"&gt;Project Oberon&lt;/a&gt; The Design of an Operating System and Compiler:&lt;/p&gt;
&lt;p&gt;Another interesting book, by Niklaus Wirth, creator of, among others, Pascal,
Modula and Oberon, describing how to basically build a computing system from scratch.&lt;/p&gt;
&lt;p&gt;I will note that I skimmed over the dense specification parts of those books
since I wasn't trying to implement or use them.&lt;/p&gt;
&lt;p&gt;Reading:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://www.amazon.com/Unconventional-Programming-Paradigms-International-September/dp/3540278842"&gt;Unconventional Programming Paradigms&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://web.media.mit.edu/~lieber/Your-Wish/"&gt;Your wish is My Command&lt;/a&gt; Giving Users the Power to Instruct their Software&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://ia802309.us.archive.org/25/items/pdfy-MgN0H1joIoDVoIC7/The_AWK_Programming_Language.pdf"&gt;The AWK Programming Language&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Mindstorms: Children, Computers And Powerful Ideas: All About Logo, How It Was Invented And How It Works by Seymour Papert&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Papers this looong week: 11 (count books as papers because why not)&lt;/p&gt;
&lt;p&gt;Papers so far: 54&lt;/p&gt;
&lt;p&gt;Papers in queue: don't know&lt;/p&gt;</description><guid>http://marianoguerra.org/es/posts/papers-and-other-things-of-the-largespanoftime-ii/</guid><pubDate>Thu, 31 Aug 2017 13:15:48 GMT</pubDate></item><item><title>Papers of the LargeSpanOfTime I</title><link>http://marianoguerra.org/es/posts/papers-of-the-largespanoftime-i/</link><dc:creator>Mariano Guerra</dc:creator><description>&lt;p&gt;Welp, some day the experiment had to end, I stopped reading 5 papers a week because
some books arrived and I read those instead and also because I was busy at work.&lt;/p&gt;
&lt;p&gt;But that doesn't mean I didn't read papers at all, so here's a list of the ones
I did read.&lt;/p&gt;
&lt;aside class="admonition note"&gt;
&lt;p class="admonition-title"&gt;Note&lt;/p&gt;
&lt;p&gt;Since I read some of them a while ago, the reviews may not be very detailed.&lt;/p&gt;
&lt;/aside&gt;
&lt;p&gt;&lt;a class="reference external" href="https://www.researchgate.net/publication/276281901_Cuneiform_A_Functional_Language_for_Large_Scale_Scientific_Data_Analysis"&gt;Cuneiform: A Functional Language for Large Scale Scientific Data Analysis&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Seems useful in practice, was expecting something else from the title.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://stratosphere.eu/assets/papers/2014-VLDBJ_Stratosphere_Overview.pdf"&gt;The Stratosphere platform for big data analytics&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;I remember reading a paper from what later became Apache Flink that I liked a
lot, I was looking for that one and I found this one instead (stratosphere
became flink), it was an interesting overview, would like to know how much of
that is still in flink.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="https://www.microsoft.com/en-us/research/wp-content/uploads/2016/02/Orleans-MSR-TR-2014-41.pdf"&gt;Orleans: Distributed Virtual Actors for Programmability and Scalability&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Really good paper, I like how it's written and the idea and implementation.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://asc.di.fct.unl.pt/~jleitao/pdf/dsn07-leitao.pdf"&gt;HyParView: a membership protocol for reliable gossip-based broadcast&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://www.gsd.inesc-id.pt/~jleitao/pdf/srds07-leitao.pdf"&gt;Epidemic Broadcast Trees&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;These two are reviewed together because they go together like bread and butter. I love both
of them, highly recommended.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://www.gsd.inesc-id.pt/~jleitao/pdf/danms08-leitao.pdf"&gt;Large-Scale Peer-to-Peer Autonomic Monitoring&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;I won't lie to you, I don't remember much about this one, but given the authors it must be good :)&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://hirzels.com/martin/papers/ecoop14-activesheets.pdf"&gt;Stream Processing with a Spreadsheet&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://sdg.csail.mit.edu/projects/objsheets/objsheets-onward2016.pdf"&gt;Object Spreadsheets: A New Computational Model for End-User Development of Data-Centric Web Applications&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;I was looking for ideas and inspiration when I read these two. I liked both,
with Object Spreadsheets being the more interesting approach.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://vita.had.co.nz/papers/layered-grammar.pdf"&gt;A Layered Grammar of Graphics&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Great paper, on my top list, maybe because I love the topic :)&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://www.vs.inf.ethz.ch/publ/papers/VirtTimeGlobStates.pdf"&gt;Virtual Time and Global States of Distributed Systems&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;A must read if you are interested in vector clocks. The non-math parts are good; I don't enjoy reading theorems a lot (not their fault).&lt;/p&gt;
&lt;p&gt;Papers this looong week: 10&lt;/p&gt;
&lt;p&gt;Papers so far: 43&lt;/p&gt;
&lt;p&gt;Papers in queue: don't want to count anymore&lt;/p&gt;</description><guid>http://marianoguerra.org/es/posts/papers-of-the-largespanoftime-i/</guid><pubDate>Tue, 15 Nov 2016 20:47:29 GMT</pubDate></item><item><title>Papers of the Week VII</title><link>http://marianoguerra.org/es/posts/papers-of-the-week-vii/</link><dc:creator>Mariano Guerra</dc:creator><description>&lt;p&gt;Because nothing lasts forever, and after one week half spent traveling and another busy one,
I managed to read 4 papers this week.&lt;/p&gt;
&lt;p&gt;The first one was interesting but comes from an area I will describe as "let's
bend relational databases to fit Event Stream Processing". That is not bad per
se, but features like joins and remembering past events make
scaling it (at least in terms of memory) quite hard. It also never discusses
distribution, which is OK for the field but not what I'm looking for.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://www.cs.cornell.edu/johannes/papers/2007/2007-cidr-cayuga.pdf"&gt;Cayuga: A General Purpose Event Monitoring System&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The interesting part about this one is the part where it introduces
&lt;a class="reference external" href="https://scholar.google.de/scholar?hl=en&amp;amp;q=visibly+pushdown+languages"&gt;Visibly Pushdown Languages&lt;/a&gt;
something that looks really interesting, but I couldn't find an introduction
for mere mortals. The descriptions are really dense and mathematical, which is
OK but makes it hard to learn for outsiders like me.&lt;/p&gt;
&lt;p&gt;Another interesting point is the fact that it uses the XML Schema to optimize
the generated VPA (Visibly Pushdown Automata) and that the implementation not
only applies to XML but to any nested semistructured data.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://people.csail.mit.edu/barzan/papers/sigmod_2012.pdf"&gt;High-Performance Complex Event Processing over XML Streams&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The review of the next one will seem to conflict with my previous reviews, but
this one put too much emphasis on low-level implementation details. Not
novel things or optimizations, just a lot of details, as if the authors found the
implementation really cool and wanted to share it with the world. Not a bad
thing per se, but in this batch I was looking for abstractions, optimizations
and distribution characteristics of stream processing, ideally focused on
distributed systems, and this one talked mainly about the DSL they built that
compiles to C. It also sorts the streams, does multiple passes over the data,
does lookahead in the stream and does a kind of "micro batches", which isn't
what I was looking for.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://www.researchgate.net/profile/Daryl_Pregibon/publication/2399935_Hancock_A_Language_for_Extracting_Signatures_from_Data_Streams/links/54350d640cf294006f737e90.pdf"&gt;Hancock: A Language for Extracting Signatures from Data Streams&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I found the approach of the last one interesting. They seem to have tried to push the
purity of the approach (everything is a regular expression), which may have ended
up as a nice model (a thing I like), but reading the code, it doesn't seem
to be really clear, at least for someone with an OO/functional background, and I think even less
so for non-programmers. Maybe the syntax doesn't help and some other syntax would
make things clearer, I don't know.&lt;/p&gt;
&lt;p&gt;Other than that the approach is interesting and it made me think of some ways
to define a stream processing language using mainly pattern matching.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://www.ii.uib.no/~karltk/phd/papers/LCTES2008.pdf"&gt;EventScript: An Event-Processing Language Based on Regular Expressions with Actions&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Papers this week: 4&lt;/p&gt;
&lt;p&gt;Papers so far: 33&lt;/p&gt;
&lt;p&gt;Papers in queue: 76&lt;/p&gt;</description><guid>http://marianoguerra.org/es/posts/papers-of-the-week-vii/</guid><pubDate>Tue, 14 Jun 2016 09:24:56 GMT</pubDate></item><item><title>Papers of the Week VI</title><link>http://marianoguerra.org/es/posts/papers-of-the-week-vi/</link><dc:creator>Mariano Guerra</dc:creator><description>&lt;p&gt;Better late than never (even though I read all the papers last week), here
is the sixth installment of Papers of the Week.&lt;/p&gt;
&lt;p&gt;Starting next week I will try to write the reviews after I read the papers and
not almost one week after when my memories are fuzzy :)&lt;/p&gt;
&lt;p&gt;The first one describes an implementation of out-of-order processing using
punctuation, interesting in that it "applies" the concept of punctuation to
building a streaming system and analyzes the result.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://www.vldb.org/pvldb/1/1453890.pdf"&gt;Out-of-Order Processing: A New Architecture for High-Performance Stream Systems&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This one describes an implementation of a storage engine using LSM Trees and a
compression technique.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://www.vldb.org/pvldb/1/1453914.pdf"&gt;Rose: Compressed, log-structured replication&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;You can read an overview of the next paper and find the link to it at &lt;a class="reference external" href="https://blog.acolyer.org/2015/10/16/holistic-configuration-management-at-facebook/"&gt;acolyer's paper of the day: Holistic Configuration Management at Facebook&lt;/a&gt;, I copy the first paragraph here:&lt;/p&gt;
&lt;pre class="literal-block"&gt;This paper gives a comprehensive description of the use cases, design,
implementation, and usage statistics of a suite of tools that manage
Facebook’s configuration end-to-end, including the frontend products,
backend systems, and mobile apps.&lt;/pre&gt;
&lt;p&gt;It's a good overview of tools and techniques used to scale and standardize
configuration management and how to avoid problems introduced by sloppy
configuration management.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://sigops.org/sosp/sosp15/current/2015-Monterey/printable/008-tang.pdf"&gt;Holistic Configuration Management at Facebook&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The next one is my favorite of the week. It takes problems that other papers
solve with some parallelization strategy, implements them in a simple
single-threaded way as a baseline, and benchmarks that baseline against
the parallel solutions. It then defines a "metric" that describes how many cores a
parallel system needs to match the single-threaded implementation. As many sites would tell
you, "the result will amaze you".&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://www.frankmcsherry.org/assets/COST.pdf"&gt;Scalability! But at what COST?&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The last one for this week surprisingly brought me to the CRDT/Lasp/&lt;a class="reference external" href="https://twitter.com/cmeik"&gt;@cmeik&lt;/a&gt; land, even though the title didn't seem to imply that. The crazy fact is that
I saw a talk about this paper at RICON 2015 and I didn't remember the title :)&lt;/p&gt;
&lt;p&gt;Some parts were hard for me since it's the first paper I've read about CRDTs, so I
don't have the vocabulary and basic theory in place, but it made me think of
some interesting applications in the IoT and monitoring spaces.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://christophermeiklejohn.com/publications/edgecom-2016-preprint.pdf"&gt;Declarative, Sliding Window Aggregations for Computations at the Edge&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Papers this week: 5&lt;/p&gt;
&lt;p&gt;Papers so far: 29&lt;/p&gt;
&lt;p&gt;Papers in queue: 82&lt;/p&gt;</description><guid>http://marianoguerra.org/es/posts/papers-of-the-week-vi/</guid><pubDate>Wed, 01 Jun 2016 08:45:44 GMT</pubDate></item><item><title>Papers of the Week V</title><link>http://marianoguerra.org/es/posts/papers-of-the-week-v/</link><dc:creator>Mariano Guerra</dc:creator><description>&lt;p&gt;I'm already late so let's go:&lt;/p&gt;
&lt;p&gt;The first one, Discretized Streams, is the one I liked the most. It's about
the theory behind what became Spark Streaming, really interesting.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://www.cs.berkeley.edu/~matei/papers/2013/sosp_spark_streaming.pdf"&gt;Discretized Streams: Fault-Tolerant Streaming Computation at Scale&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The second one is interesting in its introduction of punctuation which it
explains really well.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://www.semanticscholar.org/paper/Exploiting-Punctuation-Semantics-in-Continuous-Tucker-Maier/8375f40706943a50094acf909849a6bc611fe5e9/pdf"&gt;Exploiting Punctuation Semantics in Continuous Data Streams&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I didn't like this one too much. That doesn't mean it's bad, just that sometimes the
title gives me an idea of what a paper is about, and when it turns out to be something else I lose interest.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://research.microsoft.com/apps/pubs/default.aspx?id=156569"&gt;Consistent Streaming Through Time: A Vision for Event Stream Processing&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This one was interesting in its description of different types of windows and
the definition of window semantics. I get the feeling that if I reread it
in the not too distant future I will get a lot more out of it.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://web.cs.wpi.edu/~cs525/f06s-EAR/cs525-homepage_files/LITERATURE/sigmod05-ogi.pdf"&gt;Semantics and Evaluation Techniques for Window Aggregates in Data Streams&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;More statistical than I thought it would be, but still I learned some things
about random sampling. I guess it's one of those papers that are great if you
are looking for a solution and this paper tells you what to implement.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://arxiv.org/pdf/1012.0256.pdf"&gt;Weighted Random Sampling over Data Streams&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Papers this week: 5&lt;/p&gt;
&lt;p&gt;Papers so far: 24&lt;/p&gt;
&lt;p&gt;Papers in queue: 85 (I cleaned some duplicates and similar papers)&lt;/p&gt;</description><guid>http://marianoguerra.org/es/posts/papers-of-the-week-v/</guid><pubDate>Mon, 23 May 2016 21:00:07 GMT</pubDate></item><item><title>Papers of the Week IV</title><link>http://marianoguerra.org/es/posts/papers-of-the-week-iv/</link><dc:creator>Mariano Guerra</dc:creator><description>&lt;p&gt;Better late than never, and proving that I can count to 4, here we go with
the 4th straight paper reading week.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://research.google.com/pubs/pub41378.html"&gt;MillWheel: Fault-Tolerant Stream Processing at Internet Scale&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I liked the MillWheel paper. Google and Microsoft write really nice papers, judging from
the ones I've read.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://research.microsoft.com/en-us/projects/dryad/eurosys07.pdf"&gt;Dryad: Distributed Data-Parallel Programs from Sequential Building Blocks&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I didn't like the Dryad paper, I was expecting something else.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://research.microsoft.com/apps/pubs/default.aspx?id=66814"&gt;PacificA: Replication in Log-Based Distributed Storage Systems&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The PacificA paper is my favorite of the week.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://research.microsoft.com/apps/pubs/?id=201100"&gt;Naiad: A Timely Dataflow System&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;About Naiad, I liked the idea about tracking distributed progress.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://www.cse.chalmers.se/~rjmh/Papers/pretty.html"&gt;The Design of a Pretty-printing Library&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;I had big expectations for this paper, but it was too haskellish for my taste, I
was expecting something else.&lt;/p&gt;
&lt;p&gt;Papers this week: 5&lt;/p&gt;
&lt;p&gt;Papers so far: 19&lt;/p&gt;
&lt;p&gt;Papers in queue: 91&lt;/p&gt;</description><guid>http://marianoguerra.org/es/posts/papers-of-the-week-iv/</guid><pubDate>Mon, 16 May 2016 14:18:48 GMT</pubDate></item><item><title>Papers of the Week III</title><link>http://marianoguerra.org/es/posts/papers-of-the-week-iii/</link><dc:creator>Mariano Guerra</dc:creator><description>&lt;p&gt;No, I didn't give up. Last week was short because of a holiday and a "bridge day",
so I was riding my bike through the Black Forest, but I still read the papers
I set out to read by cramming all of them into 3 days :)&lt;/p&gt;
&lt;p&gt;First, a classic in distributed systems :)&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://www.comp.nus.edu.sg/~gilbert/pubs/BrewersConjecture-SigAct.pdf"&gt;Brewer’s Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;More on the "topic"&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://www.cs.berkeley.edu/~brewer/papers/mashah-flux-final.pdf"&gt;Highly Available, Fault-Tolerant, Parallel Dataflows&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This is a great paper, I guess it's one of the first "papers I love"&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://arxiv.org/pdf/1603.00567v1.pdf"&gt;MacroBase: Analytic Monitoring for the Internet of Things&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;This one is interesting. Much of the content sounds like what the creators of
Kafka propose; you can see what I mean by watching a talk like &lt;a class="reference external" href="https://www.youtube.com/watch?v=fU9hR3kiOK0"&gt;"turning the database inside out"&lt;/a&gt; and by reading the paper, which is quite short.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://cidrdb.org/cidr2011/Papers/CIDR11_Paper25.pdf"&gt;Towards a One Size Fits All Database Architecture&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Didn't read the paper, just the blog post summary, but it's quite descriptive.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://blog.acolyer.org/2016/05/03/gorilla-a-fast-scalable-in-memory-time-series-database/"&gt;Gorilla: a fast scalable in memory time series database&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;It has an interesting compression technique and a quote I liked:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;We found that building a reliable, fault tolerant system was the most time
consuming part of the project. While the team prototyped a high
performance, compressed, in-memory TSDB in a very short period of time, it
took several more months of hard work to make it fault tolerant. However,
the advantages of fault tolerance were visible when the system successfully
survived both real and simulated failures.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Related to the MacroBase paper:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://www.youtube.com/watch?v=077ZyyuDXYY"&gt;https://www.youtube.com/watch?v=077ZyyuDXYY&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://www.futuredata.io/"&gt;http://www.futuredata.io/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://macrobase.io/"&gt;http://macrobase.io/&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Papers this week: 4&lt;/p&gt;
&lt;p&gt;Papers so far: 14&lt;/p&gt;
&lt;p&gt;Papers in queue: 94&lt;/p&gt;
&lt;p&gt;It seems I add 30 papers to the queue for each 5 I read, I hope it's not linear :)&lt;/p&gt;</description><guid>http://marianoguerra.org/es/posts/papers-of-the-week-iii/</guid><pubDate>Tue, 10 May 2016 07:17:58 GMT</pubDate></item><item><title>Papers of the Week II</title><link>http://marianoguerra.org/es/posts/papers-of-the-week-ii/</link><dc:creator>Mariano Guerra</dc:creator><description>&lt;p&gt;In my continuous attempt to see how far I can count in roman numerals here is
the second week, still going, still 5 papers.&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://static.googleusercontent.com/media/research.google.com/en//archive/bigtable-osdi06.pdf"&gt;Bigtable: A Distributed Storage System for Structured Data&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://pdos.csail.mit.edu/archive/6.824-2004/papers/hagmann-fs.pdf"&gt;Reimplementing the Cedar File System Using Logging and Group Commit&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://static.googleusercontent.com/media/research.google.com/en//archive/sawzall-sciprog.pdf"&gt;Interpreting the Data: Parallel Analysis with Sawzall&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://www.eecs.harvard.edu/~mdw/papers/seda-sosp01.pdf"&gt;SEDA: An Architecture for Well-Conditioned, Scalable Internet Services&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://static.googleusercontent.com/media/research.google.com/en//pubs/archive/43864.pdf"&gt;The Dataflow Model: A Practical Approach to Balancing Correctness, Latency, and Cost in Massive-Scale, Unbounded, Out-of-Order Data Processing&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The one I liked the most was "The Dataflow Model...", mainly because it sparked
some ideas related to a problem I'm trying to solve.&lt;/p&gt;
&lt;p&gt;The others were also good except "Reimplementing the Cedar File System Using
Logging and Group Commit", mainly because it wasn't what I was
expecting.&lt;/p&gt;
&lt;p&gt;Read this week: 5&lt;/p&gt;
&lt;p&gt;Total read: 10&lt;/p&gt;
&lt;p&gt;In the Queue: 61&lt;/p&gt;</description><guid>http://marianoguerra.org/es/posts/papers-of-the-week-ii/</guid><pubDate>Sat, 30 Apr 2016 11:35:19 GMT</pubDate></item><item><title>Papers of The Week I</title><link>http://marianoguerra.org/es/posts/papers-of-the-week-i/</link><dc:creator>Mariano Guerra</dc:creator><description>&lt;p&gt;This is an attempt to treat what I would call acolyer's syndrome which is the
guilt felt by people who would like to read papers as often as &lt;a class="reference external" href="https://blog.acolyer.org/"&gt;Adrian Colyer&lt;/a&gt; but never do.&lt;/p&gt;
&lt;p&gt;So I will blog the ones I read here to try to follow &lt;a class="reference external" href="http://lifehacker.com/281626/jerry-seinfelds-productivity-secret"&gt;Jerry Seinfeld's Productivity Secret&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;I will blog weekly because I won't read a paper a day, but I will try to read
around 4 or 5 papers a week if they are around 12 to 15 pages long; if they are longer
I will read fewer.&lt;/p&gt;
&lt;p&gt;The initial topics are stream processing systems and distributed systems, I
will follow the references that I find interesting to inform future papers.&lt;/p&gt;
&lt;p&gt;I will also read papers that I find interesting as I go.&lt;/p&gt;
&lt;p&gt;OK, without further ado, here are the ones I read this week.&lt;/p&gt;
&lt;p&gt;Related to stream processing:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://hirzels.com/martin/papers/debs12-cep.pdf"&gt;Partition and Compose: Parallel Complex Event Processing&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="https://sungsoo.github.io/papers/ql-cql.pdf"&gt;The CQL Continuous Query Language: Semantic Foundations and Query Execution&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://cs.brown.edu/research/aurora/vldb02.pdf"&gt;Monitoring Streams – A New Class of Data Management Applications&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Classics I wanted to read:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://www.read.seas.harvard.edu/~kohler/class/cs239-w08/decandia07dynamo.pdf"&gt;Dynamo: Amazon’s Highly Available Key-value Store&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a class="reference external" href="http://research.microsoft.com/en-us/um/people/lamport/pubs/time-clocks.pdf"&gt;Time, Clocks, and the Ordering of Events in a Distributed System&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In queue: 33&lt;/p&gt;</description><guid>http://marianoguerra.org/es/posts/papers-of-the-week-i/</guid><pubDate>Sun, 24 Apr 2016 20:40:47 GMT</pubDate></item></channel></rss>