<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="/global/feed/rss.xslt" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:media="http://search.yahoo.com/mrss/" xmlns:podaccess="https://access.acast.com/schema/1.0/" xmlns:acast="https://schema.acast.com/1.0/">
    <channel>
		<ttl>60</ttl>
		<generator>acast.com</generator>
		<title>Disseminate: The Computer Science Research Podcast</title>
		<link>https://shows.acast.com/disseminate</link>
		<atom:link href="https://feeds.acast.com/public/shows/629a6154b4e1e70012764c00" rel="self" type="application/rss+xml"/>
		<language>en</language>
		<copyright>Jack Waudby</copyright>
		<itunes:keywords/>
		<itunes:author>Jack Waudby</itunes:author>
		<itunes:subtitle/>
		<itunes:summary><![CDATA[<p>This podcast features interviews with Computer Science researchers. Hosted by <a href="https://jackwaudby.github.io/" rel="noopener noreferrer" target="_blank">Dr. Jack Waudby</a>, each interview highlights the problem(s) the researchers tackled, the solutions they developed, and how their findings can be applied in practice. Aimed at industry practitioners, researchers, and students, the podcast seeks to narrow the gap between research and practice and to make awesome Computer Science research more accessible. We have two types of episode: (i) <strong>Cutting Edge </strong>(red/blue logo), where we talk to researchers about their latest work, and (ii) <strong>High Impact</strong> (gold/silver logo), where we talk to researchers about their influential work.</p><br><p><strong>You can support the show through&nbsp;</strong><a href="https://www.buymeacoffee.com/disseminate" rel="noopener noreferrer" target="_blank"><strong>Buy Me a Coffee</strong></a>.&nbsp;<strong>A donation of $3 will help us keep making you awesome Computer Science research podcasts.&nbsp;</strong></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		<description><![CDATA[<p>This podcast features interviews with Computer Science researchers. Hosted by <a href="https://jackwaudby.github.io/" rel="noopener noreferrer" target="_blank">Dr. Jack Waudby</a>, each interview highlights the problem(s) the researchers tackled, the solutions they developed, and how their findings can be applied in practice. Aimed at industry practitioners, researchers, and students, the podcast seeks to narrow the gap between research and practice and to make awesome Computer Science research more accessible. We have two types of episode: (i) <strong>Cutting Edge </strong>(red/blue logo), where we talk to researchers about their latest work, and (ii) <strong>High Impact</strong> (gold/silver logo), where we talk to researchers about their influential work.</p><br><p><strong>You can support the show through&nbsp;</strong><a href="https://www.buymeacoffee.com/disseminate" rel="noopener noreferrer" target="_blank"><strong>Buy Me a Coffee</strong></a>.&nbsp;<strong>A donation of $3 will help us keep making you awesome Computer Science research podcasts.&nbsp;</strong></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
		<itunes:explicit>false</itunes:explicit>
		<itunes:owner>
			<itunes:name>Jack Waudby</itunes:name>
			<itunes:email>info+629a6154b4e1e70012764c00@mg-eu.acast.com</itunes:email>
		</itunes:owner>
		<acast:showId>629a6154b4e1e70012764c00</acast:showId>
		<acast:showUrl>disseminate</acast:showUrl>
		<acast:signature key="EXAMPLE" algorithm="aes-256-cbc"><![CDATA[wbG1Z7+6h9QOi+CR1Dv0uQ==]]></acast:signature>
		<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmTHg2/BXqPr07kkpFZ5JfhvEZqggcpunI6E1w81XpUaBscFc3skEQ0jWG4GCmQYJ66w6pH6P/aGd3DnpJN6h/CD4icd8kZVl4HZn12KicA2k]]></acast:settings>
        <acast:network id="629a6154b4e1e70012764c03" slug="jack-waudby"><![CDATA[JACK WAUDBY]]></acast:network>
		<itunes:type>episodic</itunes:type>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1658226600118-c8a7fa0e10288202fba2d9721149a154.jpeg"/>
			<image>
				<url>https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1658226600118-c8a7fa0e10288202fba2d9721149a154.jpeg</url>
				<link>https://shows.acast.com/disseminate</link>
				<title>Disseminate: The Computer Science Research Podcast</title>
			</image>
		<item>
			<title>Mateusz Gienieczko | AnyBlox: A Framework for Self-Decoding Datasets | #69</title>
			<itunes:title>Mateusz Gienieczko | AnyBlox: A Framework for Self-Decoding Datasets | #69</itunes:title>
			<pubDate>Tue, 17 Mar 2026 07:00:00 GMT</pubDate>
			<itunes:duration>1:02:28</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/6988f792ba7d04f1d4c678af/media.mp3" length="59980992" type="audio/mpeg"/>
			<guid isPermaLink="false">6988f792ba7d04f1d4c678af</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.vldb.org/pvldb/vol18/p4017-gienieczko.pdf</link>
			<acast:episodeId>6988f792ba7d04f1d4c678af</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>mateusz-gienieczko-anyblox</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4Kc2pcTnzrL8eBZSYLWMZIeEarLIo3Or27oRZGz0TbMUt4dn+VOw1qCCV1FMUjd4IwD/7Tk5KUgYunVR9y+dxNC]]></acast:settings>
			<itunes:subtitle>Cutting Edge</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>29</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1770575805557-771355b7-9e70-4da5-a939-8aef80bdfb4f.jpeg"/>
			<description><![CDATA[<p>In this episode of Disseminate: The Computer Science Research Podcast, host Dr. Jack Waudby is joined by Mateusz Gienieczko, PhD researcher at TU Munich and co-author of the VLDB Best Paper Award-winning paper AnyBlox.</p><br><p>They dive deep into a fundamental problem in modern data systems: why cutting-edge data encodings and file formats rarely make it from research into real-world systems — and how AnyBlox proposes a radical solution.</p><br><p>Mateusz explains the core idea of self-decoding data, where datasets ship with their own portable, sandboxed decoders, allowing any database system to read any encoding safely and efficiently. Built on WebAssembly, AnyBlox bridges the long-standing gap between database research and practice without sacrificing performance, portability, or security.</p><br><p>This episode is essential listening for database researchers, data engineers, system builders, and industry practitioners interested in the future of data formats, analytics performance, and making research matter in practice.</p><br><p>Links:</p><ul><li>Paper: https://www.vldb.org/pvldb/vol18/p4017-gienieczko.pdf</li><li>GitHub: https://github.com/AnyBlox</li><li>Mat's Homepage: https://v0ldek.com/</li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode of Disseminate: The Computer Science Research Podcast, host Dr. Jack Waudby is joined by Mateusz Gienieczko, PhD researcher at TU Munich and co-author of the VLDB Best Paper Award-winning paper AnyBlox.</p><br><p>They dive deep into a fundamental problem in modern data systems: why cutting-edge data encodings and file formats rarely make it from research into real-world systems — and how AnyBlox proposes a radical solution.</p><br><p>Mateusz explains the core idea of self-decoding data, where datasets ship with their own portable, sandboxed decoders, allowing any database system to read any encoding safely and efficiently. Built on WebAssembly, AnyBlox bridges the long-standing gap between database research and practice without sacrificing performance, portability, or security.</p><br><p>This episode is essential listening for database researchers, data engineers, system builders, and industry practitioners interested in the future of data formats, analytics performance, and making research matter in practice.</p><br><p>Links:</p><ul><li>Paper: https://www.vldb.org/pvldb/vol18/p4017-gienieczko.pdf</li><li>GitHub: https://github.com/AnyBlox</li><li>Mat's Homepage: https://v0ldek.com/</li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Xiangyao Yu | Disaggregation: A New Architecture for Cloud Databases | #68</title>
			<itunes:title>Xiangyao Yu | Disaggregation: A New Architecture for Cloud Databases | #68</itunes:title>
			<pubDate>Thu, 27 Nov 2025 05:00:00 GMT</pubDate>
			<itunes:duration>42:12</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/6924ae71ab6de17613d1f33e/media.mp3" length="35469419" type="audio/mpeg"/>
			<guid isPermaLink="false">6924ae71ab6de17613d1f33e</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.vldb.org/pvldb/vol18/p5527-xiangyao.pdf</link>
			<acast:episodeId>6924ae71ab6de17613d1f33e</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>xiangyao-yu-disaggregation</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4IAJKOauUcEm95yn73EPGemG3krW3U5H1T671+iLP2pqGCj5keQmHX4gdRhki6b9crMXPBud9U4hNjNnaOwttG0]]></acast:settings>
			<itunes:subtitle>Cutting Edge</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>28</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1764011495808-10602110-4ecc-45ae-8350-ed3b8e91e8ff.jpeg"/>
			<description><![CDATA[<p>In this episode of <em>Disseminate: The Computer Science Research Podcast</em>, host Jack Waudby sits down with Xiangyao Yu (UW–Madison), one of the leading voices shaping the next generation of cloud-native databases.</p><br><p>We dive deep into <strong>disaggregation</strong> — the architectural shift transforming how modern data systems are built. Xiangyao breaks down:</p><ul><li>Why <strong>traditional shared-nothing databases</strong> struggle in cloud environments</li><li>How <strong>separating compute and storage</strong> unlocks elasticity, scalability, and cost efficiency</li><li>The evolution of disaggregated systems, from Aurora and Snowflake through to advanced pushdown processing and new modular services</li><li>His team's research on <strong>reinventing core protocols</strong> like 2-phase commit for cloud-native environments</li><li><strong>Real-time analytics</strong>, HTAP challenges, and the Hermes architecture</li><li>Where disaggregation goes next — indexing, query optimizers, materialized views, multi-cloud architectures, and more</li></ul><p><br></p><p>Whether you're a database engineer, researcher, or a practitioner building scalable cloud systems, this episode gives a clear, accessible look into the architecture that’s rapidly becoming the <em>default</em> for modern data platforms.</p><br><p>Links:</p><ul><li><a href="https://pages.cs.wisc.edu/~yxy/" rel="noopener noreferrer" target="_blank">Xiangyao Yu's Homepage</a></li><li><a href="https://www.vldb.org/pvldb/vol18/p5527-xiangyao.pdf" rel="noopener noreferrer" target="_blank">Disaggregation: A New Architecture for Cloud Databases [VLDB'25]</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode of <em>Disseminate: The Computer Science Research Podcast</em>, host Jack Waudby sits down with Xiangyao Yu (UW–Madison), one of the leading voices shaping the next generation of cloud-native databases.</p><br><p>We dive deep into <strong>disaggregation</strong> — the architectural shift transforming how modern data systems are built. Xiangyao breaks down:</p><ul><li>Why <strong>traditional shared-nothing databases</strong> struggle in cloud environments</li><li>How <strong>separating compute and storage</strong> unlocks elasticity, scalability, and cost efficiency</li><li>The evolution of disaggregated systems, from Aurora and Snowflake through to advanced pushdown processing and new modular services</li><li>His team's research on <strong>reinventing core protocols</strong> like 2-phase commit for cloud-native environments</li><li><strong>Real-time analytics</strong>, HTAP challenges, and the Hermes architecture</li><li>Where disaggregation goes next — indexing, query optimizers, materialized views, multi-cloud architectures, and more</li></ul><p><br></p><p>Whether you're a database engineer, researcher, or a practitioner building scalable cloud systems, this episode gives a clear, accessible look into the architecture that’s rapidly becoming the <em>default</em> for modern data platforms.</p><br><p>Links:</p><ul><li><a href="https://pages.cs.wisc.edu/~yxy/" rel="noopener noreferrer" target="_blank">Xiangyao Yu's Homepage</a></li><li><a href="https://www.vldb.org/pvldb/vol18/p5527-xiangyao.pdf" rel="noopener noreferrer" target="_blank">Disaggregation: A New Architecture for Cloud Databases [VLDB'25]</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Navid Eslami | Diva: Dynamic Range Filter for Var-Length Keys and Queries | #67</title>
			<itunes:title>Navid Eslami | Diva: Dynamic Range Filter for Var-Length Keys and Queries | #67</itunes:title>
			<pubDate>Thu, 13 Nov 2025 05:00:00 GMT</pubDate>
			<itunes:duration>46:50</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/69108f06a17ebcde88fff563/media.mp3" length="41213168" type="audio/mpeg"/>
			<guid isPermaLink="false">69108f06a17ebcde88fff563</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.vldb.org/pvldb/vol18/p3923-eslami.pdf</link>
			<acast:episodeId>69108f06a17ebcde88fff563</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>navid-eslami-diva</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4IKW8k+NBnsQX1h2L39St7kutw58RJlcE5RyKle4FgVLgPXTh6LE7MvmMBoDNN8lwZEYanPNDbAK/rydLi/uuJ1]]></acast:settings>
			<itunes:subtitle>Cutting Edge</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>27</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1762690838589-ba41d32c-071e-44e8-914e-f50ef60b6d5f.jpeg"/>
			<description><![CDATA[<p>In this episode of <em>Disseminate: The Computer Science Research Podcast</em>, Jack sits down with <strong>Navid Eslami</strong>, PhD researcher at the <strong>University of Toronto</strong>, to discuss his award-winning paper <strong>“DIVA: Dynamic Range Filter for Variable Length Keys and Queries”</strong>, which earned <strong>Best Research Paper at VLDB</strong>.</p><br><p>Navid breaks down how <strong>range filters</strong> extend the power of traditional filters for modern databases and storage systems, enabling <strong>faster queries, better scalability, and theoretical guarantees</strong>. We dive into:</p><ul><li>How <strong>DIVA</strong> overcomes the limitations of existing range filters</li><li>What makes it the “holy grail” of filtering for dynamic data</li><li>Real-world integration in <strong>WiredTiger</strong> (the MongoDB storage engine)</li><li>Future challenges in <strong>data distribution smoothing</strong> and <strong>hybrid filtering</strong></li></ul><p><br></p><p>Whether you're a <strong>database engineer</strong>, <strong>systems researcher</strong>, or <strong>student</strong> exploring data structures, this episode reveals how cutting-edge research can transform how we query, filter, and scale modern data systems.</p><br><p>Links:</p><ul><li><a href="https://www.vldb.org/pvldb/vol18/p3923-eslami.pdf" rel="noopener noreferrer" target="_blank">Diva: Dynamic Range Filter for Var-Length Keys and Queries [VLDB'25]</a></li><li><a href="https://github.com/n3slami/Diva" rel="noopener noreferrer" target="_blank">Diva on GitHub</a></li><li><a href="https://www.linkedin.com/in/navid-eslami-14036823a/?originalSubdomain=ir" rel="noopener noreferrer" target="_blank">Navid's LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode of <em>Disseminate: The Computer Science Research Podcast</em>, Jack sits down with <strong>Navid Eslami</strong>, PhD researcher at the <strong>University of Toronto</strong>, to discuss his award-winning paper <strong>“DIVA: Dynamic Range Filter for Variable Length Keys and Queries”</strong>, which earned <strong>Best Research Paper at VLDB</strong>.</p><br><p>Navid breaks down how <strong>range filters</strong> extend the power of traditional filters for modern databases and storage systems, enabling <strong>faster queries, better scalability, and theoretical guarantees</strong>. We dive into:</p><ul><li>How <strong>DIVA</strong> overcomes the limitations of existing range filters</li><li>What makes it the “holy grail” of filtering for dynamic data</li><li>Real-world integration in <strong>WiredTiger</strong> (the MongoDB storage engine)</li><li>Future challenges in <strong>data distribution smoothing</strong> and <strong>hybrid filtering</strong></li></ul><p><br></p><p>Whether you're a <strong>database engineer</strong>, <strong>systems researcher</strong>, or <strong>student</strong> exploring data structures, this episode reveals how cutting-edge research can transform how we query, filter, and scale modern data systems.</p><br><p>Links:</p><ul><li><a href="https://www.vldb.org/pvldb/vol18/p3923-eslami.pdf" rel="noopener noreferrer" target="_blank">Diva: Dynamic Range Filter for Var-Length Keys and Queries [VLDB'25]</a></li><li><a href="https://github.com/n3slami/Diva" rel="noopener noreferrer" target="_blank">Diva on GitHub</a></li><li><a href="https://www.linkedin.com/in/navid-eslami-14036823a/?originalSubdomain=ir" rel="noopener noreferrer" target="_blank">Navid's LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Adaptive Factorization in DuckDB with Paul Groß</title>
			<itunes:title>Adaptive Factorization in DuckDB with Paul Groß</itunes:title>
			<pubDate>Thu, 06 Nov 2025 05:00:00 GMT</pubDate>
			<itunes:duration>51:15</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/690bcb8b68ccec9b8e112f77/media.mp3" length="52102043" type="audio/mpeg"/>
			<guid isPermaLink="false">690bcb8b68ccec9b8e112f77</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://vldb.org/cidrdb/papers/2025/p21-gro.pdf</link>
			<acast:episodeId>690bcb8b68ccec9b8e112f77</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>adaptive-factorization-in-duckdb-with-paul-gro</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4KMm0mU6AThyWLFlDflLDaICnFz655ujF+Lfe5plJHAYX7RVG+LUwQi5p69gyTY7dB6KtzD13rcCL1131jOkU9F]]></acast:settings>
			<itunes:subtitle>DuckDB in Research</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>7</itunes:season>
			<itunes:episode>15</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1762380490349-1b32aa19-34a6-4e65-a982-efa7cfffdb6d.jpeg"/>
			<description><![CDATA[<p>In this episode of the <em>DuckDB in Research series</em>, host Jack Waudby sits down with Paul Groß, PhD student at CWI Amsterdam, to explore his work on adaptive factorization and worst-case optimal joins - techniques that push the boundaries of analytical query performance.</p><br><p>Paul shares insights from his CIDR'25 paper <em>“Adaptive Factorization Using Linear Chained Hash Tables”</em>, revealing how decades of database theory meet modern, practical system design in DuckDB. From hash table internals to adaptive query planning, this episode uncovers how research innovations are becoming part of real-world systems.</p><br><p>Whether you’re a database researcher, engineer, or curious student, you’ll come away with a deeper understanding of query optimization and the realities of systems engineering.</p><br><p>Links:</p><ul><li><a href="https://vldb.org/cidrdb/papers/2025/p21-gro.pdf" rel="noopener noreferrer" target="_blank">Adaptive Factorization Using Linear-Chained Hash Tables</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode of the <em>DuckDB in Research series</em>, host Jack Waudby sits down with Paul Groß, PhD student at CWI Amsterdam, to explore his work on adaptive factorization and worst-case optimal joins - techniques that push the boundaries of analytical query performance.</p><br><p>Paul shares insights from his CIDR'25 paper <em>“Adaptive Factorization Using Linear Chained Hash Tables”</em>, revealing how decades of database theory meet modern, practical system design in DuckDB. From hash table internals to adaptive query planning, this episode uncovers how research innovations are becoming part of real-world systems.</p><br><p>Whether you’re a database researcher, engineer, or curious student, you’ll come away with a deeper understanding of query optimization and the realities of systems engineering.</p><br><p>Links:</p><ul><li><a href="https://vldb.org/cidrdb/papers/2025/p21-gro.pdf" rel="noopener noreferrer" target="_blank">Adaptive Factorization Using Linear-Chained Hash Tables</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Parachute: Rethinking Query Execution and Bidirectional Information Flow in DuckDB - with Mihail Stoian</title>
			<itunes:title>Parachute: Rethinking Query Execution and Bidirectional Information Flow in DuckDB - with Mihail Stoian</itunes:title>
			<pubDate>Thu, 30 Oct 2025 05:00:00 GMT</pubDate>
			<itunes:duration>36:34</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/6900a00baee65e114aa29640/media.mp3" length="35118290" type="audio/mpeg"/>
			<guid isPermaLink="false">6900a00baee65e114aa29640</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/parachute</link>
			<acast:episodeId>6900a00baee65e114aa29640</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>parachute</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4KohQAIHKZ5EiEbmzDoJyb+hnNZVdH6ORyGn+cPBa/YvsBgMZhgUbH+Xb7kIb0CV342mmiCYpc574jPggJNa1TV]]></acast:settings>
			<itunes:subtitle>DuckDB in Research</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>7</itunes:season>
			<itunes:episode>14</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1761926893504-b6635f57-ba6d-44d8-8217-6fd76567a9d4.jpeg"/>
			<description><![CDATA[<p>In this episode of the <em>DuckDB in Research </em>series, host <strong>Jack Waudby</strong> sits down with <strong>Mihail Stoian</strong>, PhD student at the <strong>Data Systems Lab, University of Technology Nuremberg</strong>, to unpack the cutting-edge ideas behind <strong>Parachute</strong>, a new approach to robust query processing and bidirectional information passing in modern analytical databases.</p><br><p>We explore how <strong>Parachute bridges theory and practice</strong>, combining concepts from instance-optimal algorithms and semi-join filtering to boost performance in <strong>DuckDB</strong>, the in-process analytical SQL engine that’s reshaping how research meets real-world data systems.</p><br><p>Mihail discusses:</p><ul><li>How <em>Parachute</em> extends semi-join filtering for two-way information flow</li><li>The challenges of implementing research ideas inside DuckDB</li><li>Practical performance gains on TPC-H and CEB workloads</li><li>The future of adaptive query processing and research-driven system design</li></ul><p><br></p><p>Whether you're a <strong>database researcher</strong>, <strong>systems engineer</strong>, or <strong>curious practitioner</strong>, this deep-dive reveals how academic innovation continues to shape modern data infrastructure.</p><br><p>Links:</p><ul><li><a href="https://www.arxiv.org/pdf/2506.13670" rel="noopener noreferrer" target="_blank">Parachute: Single-Pass Bi-Directional Information Passing VLDB 2025 Paper</a></li><li><a href="https://stoianmihail.github.io/" rel="noopener noreferrer" target="_blank">Mihail's homepage</a></li><li><a href="https://github.com/utndatasystems/parachute" rel="noopener noreferrer" target="_blank">Parachute's Github repo</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode of the <em>DuckDB in Research </em>series, host <strong>Jack Waudby</strong> sits down with <strong>Mihail Stoian</strong>, PhD student at the <strong>Data Systems Lab, University of Technology Nuremberg</strong>, to unpack the cutting-edge ideas behind <strong>Parachute</strong>, a new approach to robust query processing and bidirectional information passing in modern analytical databases.</p><br><p>We explore how <strong>Parachute bridges theory and practice</strong>, combining concepts from instance-optimal algorithms and semi-join filtering to boost performance in <strong>DuckDB</strong>, the in-process analytical SQL engine that’s reshaping how research meets real-world data systems.</p><br><p>Mihail discusses:</p><ul><li>How <em>Parachute</em> extends semi-join filtering for two-way information flow</li><li>The challenges of implementing research ideas inside DuckDB</li><li>Practical performance gains on TPC-H and CEB workloads</li><li>The future of adaptive query processing and research-driven system design</li></ul><p><br></p><p>Whether you're a <strong>database researcher</strong>, <strong>systems engineer</strong>, or <strong>curious practitioner</strong>, this deep-dive reveals how academic innovation continues to shape modern data infrastructure.</p><br><p>Links:</p><ul><li><a href="https://www.arxiv.org/pdf/2506.13670" rel="noopener noreferrer" target="_blank">Parachute: Single-Pass Bi-Directional Information Passing VLDB 2025 Paper</a></li><li><a href="https://stoianmihail.github.io/" rel="noopener noreferrer" target="_blank">Mihail's homepage</a></li><li><a href="https://github.com/utndatasystems/parachute" rel="noopener noreferrer" target="_blank">Parachute's Github repo</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Anarchy in the Database: Abigale Kim on DuckDB and DBMS Extensibility</title>
			<itunes:title>Anarchy in the Database: Abigale Kim on DuckDB and DBMS Extensibility</itunes:title>
			<pubDate>Thu, 23 Oct 2025 05:00:00 GMT</pubDate>
			<itunes:duration>46:24</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/68f7fbd763b81f879d5532a4/media.mp3" length="44559144" type="audio/mpeg"/>
			<guid isPermaLink="false">68f7fbd763b81f879d5532a4</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/anarchy-in-the-database-abigale-kim-on-dbms-extensibility</link>
			<acast:episodeId>68f7fbd763b81f879d5532a4</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>anarchy-in-the-database-abigale-kim-on-dbms-extensibility</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4KRTdb5hn+s5DNzJl1DDvfiPeVpXs/Cn8POlD6iXTlP5bcZrmJSzsyNwlFeVr0nGZb45ZbOQc9WD2kKEALuajmt]]></acast:settings>
			<itunes:subtitle>DuckDB in Research</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>7</itunes:season>
			<itunes:episode>13</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1761156515511-7d3458a6-9370-4b0e-be29-1d8d75cf3923.jpeg"/>
			<description><![CDATA[<p>In this episode of the <em>DuckDB in Research series</em>, host <strong>Jack Waudby</strong> talks with <strong>Abigale Kim</strong>, PhD student at the University of Wisconsin–Madison and author of the VLDB 2025 paper<strong>: </strong><em>“Anarchy in the Database: A Survey and Evaluation of DBMS Extensibility”. </em>They explore how database extensibility is reshaping modern data systems — and why <strong>DuckDB</strong> is emerging as the gold standard for safe, flexible, and high-performance extensions. Abigale shares the inside story of her research, the surprises uncovered when testing Postgres and DuckDB extensions, and what’s next for extensibility and composable database design.</p><br><p>This episode is perfect for <strong>researchers, practitioners, and students</strong> interested in databases, systems design, and the interplay between academia and industry innovation.</p><br><p>Highlights:</p><ul><li>What “extensibility” really means in a DBMS</li><li>How DuckDB compares to Postgres, MySQL, and Redis</li><li>The rise of GPU-accelerated DuckDB extensions</li><li>Why bridging research and engineering matters for the future of databases</li></ul><p><br></p><p>Links:</p><ul><li><a href="https://www.vldb.org/pvldb/vol18/p1962-kim.pdf" rel="noopener noreferrer" target="_blank">Anarchy in the Database: A Survey and Evaluation of Database Management System Extensibility VLDB 2025</a></li><li><a href="https://arxiv.org/pdf/2508.04701" rel="noopener noreferrer" target="_blank">Rethinking Analytical Processing in the GPU Era</a></li></ul><p><br></p><p>You can find Abigale at:</p><ul><li><a href="https://x.com/abigale_kim" rel="noopener noreferrer" target="_blank">X</a></li><li><a href="https://bsky.app/profile/abigalekim.bsky.social" rel="noopener noreferrer" target="_blank">Bluesky</a></li><li><a href="https://abigalekim.github.io/" rel="noopener noreferrer" target="_blank">Personal site</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode of the <em>DuckDB in Research series</em>, host <strong>Jack Waudby</strong> talks with <strong>Abigale Kim</strong>, PhD student at the University of Wisconsin–Madison and author of the VLDB 2025 paper <em>“Anarchy in the Database: A Survey and Evaluation of DBMS Extensibility”. </em>They explore how database extensibility is reshaping modern data systems — and why <strong>DuckDB</strong> is emerging as the gold standard for safe, flexible, and high-performance extensions. Abigale shares the inside story of her research, the surprises uncovered when testing Postgres and DuckDB extensions, and what’s next for extensibility and composable database design.</p><br><p>This episode is perfect for <strong>researchers, practitioners, and students</strong> interested in databases, systems design, and the interplay between academia and industry innovation.</p><br><p>Highlights:</p><ul><li>What “extensibility” really means in a DBMS</li><li>How DuckDB compares to Postgres, MySQL, and Redis</li><li>The rise of GPU-accelerated DuckDB extensions</li><li>Why bridging research and engineering matters for the future of databases</li></ul><p><br></p><p>Links:</p><ul><li><a href="https://www.vldb.org/pvldb/vol18/p1962-kim.pdf" rel="noopener noreferrer" target="_blank">Anarchy in the Database: A Survey and Evaluation of Database Management System Extensibility VLDB 2025</a></li><li><a href="https://arxiv.org/pdf/2508.04701" rel="noopener noreferrer" target="_blank">Rethinking Analytical Processing in the GPU Era</a></li></ul><p><br></p><p>You can find Abigale at:</p><ul><li><a href="https://x.com/abigale_kim" rel="noopener noreferrer" target="_blank">X</a></li><li><a href="https://bsky.app/profile/abigalekim.bsky.social" rel="noopener noreferrer" target="_blank">Bluesky</a></li><li><a href="https://abigalekim.github.io/" rel="noopener noreferrer" target="_blank">Personal site</a></li></ul><hr><p style='color:grey; 
font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Recursive CTEs, Trampolines, and Teaching Databases with DuckDB - with Prof. Torsten Grust</title>
			<itunes:title>Recursive CTEs, Trampolines, and Teaching Databases with DuckDB - with Prof. Torsten Grust</itunes:title>
			<pubDate>Thu, 16 Oct 2025 17:30:00 GMT</pubDate>
			<itunes:duration>51:05</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/68efef71c68aefb908a1cf89/media.mp3" length="49046776" type="audio/mpeg"/>
			<guid isPermaLink="false">68efef71c68aefb908a1cf89</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/torsten-grust</link>
			<acast:episodeId>68efef71c68aefb908a1cf89</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>torsten-grust</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4Lq+v+L9i9xySSiIYDqZqRKzF5yAogWG2PEMCEJzJ64RHU07lc7YsJxKNXh1LBdMPcI4c7e5dxEFtEOIEDzrop0]]></acast:settings>
			<itunes:subtitle>DuckDB in Research</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>7</itunes:season>
			<itunes:episode>12</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1760781402789-63d45494-f58a-4aaf-9ca9-77e974e0e8da.jpeg"/>
			<description><![CDATA[<p>In this episode of the <em>DuckDB in Research series</em>, host Dr Jack Waudby talks with<strong> Professor Torsten Grust</strong> from the University of Tübingen. Torsten is one of the pioneers behind DuckDB’s implementation of recursive CTEs.</p><br><p>In the episode they unpack:</p><ul><li>The power of <strong>recursive CTEs</strong> and how they turn SQL into a full-fledged programming language.</li><li>The story behind <strong>adding recursion to DuckDB</strong>, including the <em>using key</em> feature and the <em>trampoline</em> and <em>TTL</em> extensions emerging from Torsten’s lab.</li><li>How these ideas are transforming research, teaching, and even DuckDB’s internal architecture.</li><li>Why <strong>DuckDB makes databases exciting again</strong> — from classroom to cutting-edge systems research.</li></ul><p>If you’re into <strong>data systems, query processing, or bridging research and practice</strong>, this episode is for you.</p><br><p>Links:</p><ul><li><a href="https://duckdb.org/2025/05/23/using-key" rel="noopener noreferrer" target="_blank">USING KEY in Recursive CTEs</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3722212.3725107" rel="noopener noreferrer" target="_blank">How DuckDB is USING KEY to Unlock Recursive Query Performance</a></li><li><a href="https://mail.vldb.org/cidrdb/papers/2025/p1-lambrecht.pdf" rel="noopener noreferrer" target="_blank">Trampoline-Style Queries for SQL</a></li><li><a href="https://github.com/DBatUTuebingen/Advent_of_Code" rel="noopener noreferrer" target="_blank">U Tübingen Advent of code</a></li><li><a href="https://db.cs.uni-tuebingen.de/publications/2023/a-fix-for-the-fixation-on-fixpoints/a-fix-for-the-fixation-on-fixpoints.pdf" rel="noopener noreferrer" target="_blank">A Fix for the Fixation on Fixpoints</a></li><li><a href="https://dl.acm.org/doi/abs/10.1145/3448016.3457272" rel="noopener noreferrer" target="_blank">One WITH RECURSIVE is Worth Many GOTOs</a></li><li><a 
href="https://db.cs.uni-tuebingen.de/team/members/torsten-grust/" rel="noopener noreferrer" target="_blank">Torsten's homepage</a></li><li><a href="https://x.com/teggy?lang=en-GB" rel="noopener noreferrer" target="_blank">Torsten's X</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode of the <em>DuckDB in Research series</em>, host Dr Jack Waudby talks with<strong> Professor Torsten Grust</strong> from the University of Tübingen. Torsten is one of the pioneers behind DuckDB’s implementation of recursive CTEs.</p><br><p>In the episode they unpack:</p><ul><li>The power of <strong>recursive CTEs</strong> and how they turn SQL into a full-fledged programming language.</li><li>The story behind <strong>adding recursion to DuckDB</strong>, including the <em>using key</em> feature and the <em>trampoline</em> and <em>TTL</em> extensions emerging from Torsten’s lab.</li><li>How these ideas are transforming research, teaching, and even DuckDB’s internal architecture.</li><li>Why <strong>DuckDB makes databases exciting again</strong> — from classroom to cutting-edge systems research.</li></ul><p>If you’re into <strong>data systems, query processing, or bridging research and practice</strong>, this episode is for you.</p><br><p>Links:</p><ul><li><a href="https://duckdb.org/2025/05/23/using-key" rel="noopener noreferrer" target="_blank">USING KEY in Recursive CTEs</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3722212.3725107" rel="noopener noreferrer" target="_blank">How DuckDB is USING KEY to Unlock Recursive Query Performance</a></li><li><a href="https://mail.vldb.org/cidrdb/papers/2025/p1-lambrecht.pdf" rel="noopener noreferrer" target="_blank">Trampoline-Style Queries for SQL</a></li><li><a href="https://github.com/DBatUTuebingen/Advent_of_Code" rel="noopener noreferrer" target="_blank">U Tübingen Advent of code</a></li><li><a href="https://db.cs.uni-tuebingen.de/publications/2023/a-fix-for-the-fixation-on-fixpoints/a-fix-for-the-fixation-on-fixpoints.pdf" rel="noopener noreferrer" target="_blank">A Fix for the Fixation on Fixpoints</a></li><li><a href="https://dl.acm.org/doi/abs/10.1145/3448016.3457272" rel="noopener noreferrer" target="_blank">One WITH RECURSIVE is Worth Many GOTOs</a></li><li><a 
href="https://db.cs.uni-tuebingen.de/team/members/torsten-grust/" rel="noopener noreferrer" target="_blank">Torsten's homepage</a></li><li><a href="https://x.com/teggy?lang=en-GB" rel="noopener noreferrer" target="_blank">Torsten's X</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>DuckDB in Research S2 Coming Soon!</title>
			<itunes:title>DuckDB in Research S2 Coming Soon!</itunes:title>
			<pubDate>Thu, 16 Oct 2025 12:35:10 GMT</pubDate>
			<itunes:duration>2:06</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/68f0e67e2c8b779d1df91288/media.mp3" length="2025116" type="audio/mpeg"/>
			<guid isPermaLink="false">68f0e67e2c8b779d1df91288</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/duckdb-in-research-s2-coming-soon</link>
			<acast:episodeId>68f0e67e2c8b779d1df91288</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>duckdb-in-research-s2-coming-soon</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4JoCOTNsmOtJFQgDwxfrLMeYZpEIdZCtYovGwo9Ehe8IhhntsrUG0RyXWyqLYU1zYsfXF45iAFePpoiN7O0WGLZ]]></acast:settings>
			<itunes:subtitle>DuckDB in Research</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>7</itunes:season>
			<itunes:episode>11</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1760617367711-a02e4cce-dde2-4737-9aa6-73d2432774e8.jpeg"/>
			<description><![CDATA[<p>Hey folks! The DuckDB in Research series is back for S2!</p><br><p>In this season we chat with:</p><ul><li><strong>Torsten Grust:</strong> Recursive CTEs</li><li><strong>Abigale Kim:</strong> <a href="https://www.vldb.org/pvldb/vol18/p1962-kim.pdf" rel="noopener noreferrer" target="_blank">Anarchy in the Database</a></li><li><strong>Mihail Stoian:</strong> <a href="https://www.arxiv.org/pdf/2506.13670" rel="noopener noreferrer" target="_blank">Parachute: Single-Pass Bi-Directional Information Passing</a></li><li><strong>Paul Gross:</strong> <a href="https://vldb.org/cidrdb/papers/2025/p21-gro.pdf" rel="noopener noreferrer" target="_blank">Adaptive Factorization Using Linear-Chained Hash Tables</a></li></ul><p><br></p><p>Whether you're a researcher, an engineer, or just curious about the intersection of databases and innovation, we are sure you will love this series.</p><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>Hey folks! The DuckDB in Research series is back for S2!</p><br><p>In this season we chat with:</p><ul><li><strong>Torsten Grust:</strong> Recursive CTEs</li><li><strong>Abigale Kim:</strong> <a href="https://www.vldb.org/pvldb/vol18/p1962-kim.pdf" rel="noopener noreferrer" target="_blank">Anarchy in the Database</a></li><li><strong>Mihail Stoian:</strong> <a href="https://www.arxiv.org/pdf/2506.13670" rel="noopener noreferrer" target="_blank">Parachute: Single-Pass Bi-Directional Information Passing</a></li><li><strong>Paul Gross:</strong> <a href="https://vldb.org/cidrdb/papers/2025/p21-gro.pdf" rel="noopener noreferrer" target="_blank">Adaptive Factorization Using Linear-Chained Hash Tables</a></li></ul><p><br></p><p>Whether you're a researcher, an engineer, or just curious about the intersection of databases and innovation, we are sure you will love this series.</p><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title><![CDATA[Rohan Padhye & Ao Li | Fray: An Efficient General-Purpose Concurrency JVM Testing Platform | #66]]></title>
			<itunes:title><![CDATA[Rohan Padhye & Ao Li | Fray: An Efficient General-Purpose Concurrency JVM Testing Platform | #66]]></itunes:title>
			<pubDate>Mon, 06 Oct 2025 07:07:31 GMT</pubDate>
			<itunes:duration>58:45</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/68e13998965488b63a4b4f15/media.mp3" length="56410809" type="audio/mpeg"/>
			<guid isPermaLink="false">68e13998965488b63a4b4f15</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://arxiv.org/pdf/2501.12618</link>
			<acast:episodeId>68e13998965488b63a4b4f15</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>rohan-padhye-ao-li-fray</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4LVs+XU9ODzwVyTbCbMP4XdBHyM+O/h67NZLPtDaK2bdG7/JPLeq1sbUpZjQ9Fn+leA3cTcwGo7lOK/2B8jFNyC]]></acast:settings>
			<itunes:subtitle>Cutting Edge</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>26</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1759590617551-f108eacb-33a5-433c-9dc7-f5adc6c9be50.jpeg"/>
			<description><![CDATA[<p>In this episode of Disseminate: The Computer Science Research Podcast, guest host Bogdan Stoica sits down with Ao Li and Rohan Padhye (Carnegie Mellon University) to discuss their OOPSLA 2025 paper: "Fray: An Efficient General-Purpose Concurrency Testing Platform for the JVM".</p><br><p>We dive into:</p><ul><li>Why concurrency bugs remain so hard to catch -- even in "well-tested" Java projects.</li><li>The design of Fray, a new concurrency testing platform that outperforms prior tools like JPF and rr.</li><li>Real-world bugs discovered in Apache Kafka, Lucene, and Google Guava.</li><li>The gap between academic research and industrial practice, and how Fray bridges it.</li><li>What’s next for concurrency testing: debugging tools, distributed systems, and beyond.</li></ul><p><br></p><p>If you’re a Java developer, systems researcher, or just curious about how to make software more reliable, this conversation is packed with insights on the future of software testing.</p><br><p>Links &amp; Resources:</p><p>- <a href="https://arxiv.org/pdf/2501.12618" rel="noopener noreferrer" target="_blank">The Fray paper (OOPSLA 2025)</a></p><p>- <a href="https://github.com/cmu-pasta/fray" rel="noopener noreferrer" target="_blank">Fray on GitHub</a></p><p>- <a href="https://aoli.al/" rel="noopener noreferrer" target="_blank">Ao Li’s research</a></p><p>- <a href="https://rohan.padhye.org/" rel="noopener noreferrer" target="_blank">Rohan Padhye’s research</a></p><br><p>Don’t forget to like, subscribe, and hit the 🔔 to stay updated on the latest episodes about cutting-edge computer science research.</p><br><p>#Java #Concurrency #SoftwareTesting #Fray #OOPSLA2025 #Programming #Debugging #JVM #ComputerScience #ResearchPodcast</p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode of Disseminate: The Computer Science Research Podcast, guest host Bogdan Stoica sits down with Ao Li and Rohan Padhye (Carnegie Mellon University) to discuss their OOPSLA 2025 paper: "Fray: An Efficient General-Purpose Concurrency Testing Platform for the JVM".</p><br><p>We dive into:</p><ul><li>Why concurrency bugs remain so hard to catch -- even in "well-tested" Java projects.</li><li>The design of Fray, a new concurrency testing platform that outperforms prior tools like JPF and rr.</li><li>Real-world bugs discovered in Apache Kafka, Lucene, and Google Guava.</li><li>The gap between academic research and industrial practice, and how Fray bridges it.</li><li>What’s next for concurrency testing: debugging tools, distributed systems, and beyond.</li></ul><p><br></p><p>If you’re a Java developer, systems researcher, or just curious about how to make software more reliable, this conversation is packed with insights on the future of software testing.</p><br><p>Links &amp; Resources:</p><p>- <a href="https://arxiv.org/pdf/2501.12618" rel="noopener noreferrer" target="_blank">The Fray paper (OOPSLA 2025)</a></p><p>- <a href="https://github.com/cmu-pasta/fray" rel="noopener noreferrer" target="_blank">Fray on GitHub</a></p><p>- <a href="https://aoli.al/" rel="noopener noreferrer" target="_blank">Ao Li’s research</a></p><p>- <a href="https://rohan.padhye.org/" rel="noopener noreferrer" target="_blank">Rohan Padhye’s research</a></p><br><p>Don’t forget to like, subscribe, and hit the 🔔 to stay updated on the latest episodes about cutting-edge computer science research.</p><br><p>#Java #Concurrency #SoftwareTesting #Fray #OOPSLA2025 #Programming #Debugging #JVM #ComputerScience #ResearchPodcast</p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title><![CDATA[Shrey Tiwari | It's About Time: A Study of Date and Time Bugs in Python Software | #65]]></title>
			<itunes:title><![CDATA[Shrey Tiwari | It's About Time: A Study of Date and Time Bugs in Python Software | #65]]></itunes:title>
			<pubDate>Tue, 23 Sep 2025 07:55:51 GMT</pubDate>
			<itunes:duration>1:05:29</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/68d2528702bd591597816b41/media.mp3" length="62874529" type="audio/mpeg"/>
			<guid isPermaLink="false">68d2528702bd591597816b41</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://rohan.padhye.org/files/datetimebugs-msr25.pdf</link>
			<acast:episodeId>68d2528702bd591597816b41</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>shrey-tiwari-its-about-time</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4Liw9+urBJgwbzV23PgVksmhQAGXZuGW2WaVi3pUi+unV9Pmh2nwT1xMuGOareyyzlBMikU/Y5WtQsij77KfdpJ]]></acast:settings>
			<itunes:subtitle>Cutting Edge</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>25</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1758613369570-f5344f25-bbee-4fdc-ab60-973d505421d6.jpeg"/>
			<description><![CDATA[<p>In this episode, Bogdan Stoica, Postdoctoral Research Associate in the SysNet group at the University of Illinois Urbana-Champaign (UIUC), steps in to guest host. Bogdan sits down with Shrey Tiwari, a PhD student in the Software and Societal Systems Department at Carnegie Mellon University and a member of the PASTA Lab, advised by Prof. Rohan Padhye. Together, they dive into Shrey’s award-winning research on date and time bugs in open-source Python software, exploring why these issues are so deceptively tricky and how they continue to affect systems we rely on every day.</p><br><p>The conversation traces Shrey’s journey from industry to research, including formative experiences at Citrix and Microsoft Research, and how those shaped his passion for software reliability. Shrey and Bogdan discuss the surprising complexity of date and time handling, the methodology behind Shrey’s empirical study, and the practical lessons developers can take away to build more robust systems. Along the way, they highlight broader questions about testing, bug detection, and the future role of AI in ensuring software correctness. This episode is a must-listen for anyone interested in debugging, reliability, and the hidden challenges that underpin modern software.</p><br><p>Links:</p><ul><li><a href="https://rohan.padhye.org/files/datetimebugs-msr25.pdf" rel="noopener noreferrer" target="_blank">It’s About Time: An Empirical Study of Date and Time Bugs in Open-Source Python Software</a> 🏆&nbsp;<strong>ACM SIGSOFT Distinguished Paper Award</strong></li><li><a href="https://www.shreytiwari.com/" rel="noopener noreferrer" target="_blank">Shrey's homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, Bogdan Stoica, Postdoctoral Research Associate in the SysNet group at the University of Illinois Urbana-Champaign (UIUC), steps in to guest host. Bogdan sits down with Shrey Tiwari, a PhD student in the Software and Societal Systems Department at Carnegie Mellon University and a member of the PASTA Lab, advised by Prof. Rohan Padhye. Together, they dive into Shrey’s award-winning research on date and time bugs in open-source Python software, exploring why these issues are so deceptively tricky and how they continue to affect systems we rely on every day.</p><br><p>The conversation traces Shrey’s journey from industry to research, including formative experiences at Citrix and Microsoft Research, and how those shaped his passion for software reliability. Shrey and Bogdan discuss the surprising complexity of date and time handling, the methodology behind Shrey’s empirical study, and the practical lessons developers can take away to build more robust systems. Along the way, they highlight broader questions about testing, bug detection, and the future role of AI in ensuring software correctness. This episode is a must-listen for anyone interested in debugging, reliability, and the hidden challenges that underpin modern software.</p><br><p>Links:</p><ul><li><a href="https://rohan.padhye.org/files/datetimebugs-msr25.pdf" rel="noopener noreferrer" target="_blank">It’s About Time: An Empirical Study of Date and Time Bugs in Open-Source Python Software</a> 🏆&nbsp;<strong>ACM SIGSOFT Distinguished Paper Award</strong></li><li><a href="https://www.shreytiwari.com/" rel="noopener noreferrer" target="_blank">Shrey's homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Lessons Learned from Five Years of Artifact Evaluations at EuroSys | #64</title>
			<itunes:title>Lessons Learned from Five Years of Artifact Evaluations at EuroSys | #64</itunes:title>
			<pubDate>Wed, 30 Jul 2025 11:00:00 GMT</pubDate>
			<itunes:duration>43:48</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/68892dd469e88bb0850efc1a/media.mp3" length="42054745" type="audio/mpeg"/>
			<guid isPermaLink="false">68892dd469e88bb0850efc1a</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://dx.doi.org/10.1145/3736731.3746152</link>
			<acast:episodeId>68892dd469e88bb0850efc1a</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>lessons-learned-from-five-years-of-artifact-evaluations-at-e</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4K+JeNgLrdCDT4iL9nZvjrQBcjrQQA47k34tbWSu7W5HIvgdRvrOKt9vL1cD9sxIUSXugom/J8k9zC4nDgMrRuE]]></acast:settings>
			<itunes:subtitle>Cutting Edge</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>24</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1753819690955-f105dcb6-809c-45e4-90d7-9194dd7895c8.jpeg"/>
			<description><![CDATA[<p>In this episode we are joined by Thaleia Doudali, Miguel Matos, and Anjo Vahldiek-Oberwagner to delve into five years of experience managing artifact evaluation at the EuroSys conference. They explain the goals and mechanics of artifact evaluation, a voluntary process that encourages reproducibility and reusability in computer systems research by assessing the supporting code, data, and documentation of accepted papers. The conversation outlines the three-tiered badge system, the multi-phase review process, and the importance of open-source practices. The guests present data showing increasing participation, sustained artifact availability, and varying levels of community engagement, underscoring the growing relevance of artifacts in validating and extending research.</p><br><p>The discussion also highlights recurring challenges such as tight timelines between paper acceptance and camera-ready deadlines, disparities in expectations between main program and artifact committees, difficulties with specialized hardware requirements, and lack of institutional continuity among evaluators. To address these, the guests propose early artifact preparation, stronger integration across committees, formalization of evaluation guidelines, and possibly making artifact submission mandatory. They advocate for broader standardization across CS subfields and suggest introducing a “Test of Time” award for artifacts. 
Looking to the future, they envision a more scalable, consistent, and impactful artifact evaluation process—but caution that continued growth in paper volume will demand innovation to maintain quality and reviewer sustainability.</p><br><p>Links:</p><ul><li><a href="https://nebelwelt.net/files/25REP.pdf" rel="noopener noreferrer" target="_blank">Lessons Learned from Five Years of Artifact Evaluations at EuroSys</a> [<a href="https://dx.doi.org/10.1145/3736731.3746152" rel="noopener noreferrer" target="_blank">DOI</a>] </li><li><a href="https://thaleia-dimitradoudali.github.io/" rel="noopener noreferrer" target="_blank">Thaleia's Homepage</a></li><li><a href="https://vahldiek.github.io/" rel="noopener noreferrer" target="_blank">Anjo's Homepage</a></li><li><a href="https://miguelmatos.me/" rel="noopener noreferrer" target="_blank">Miguel's Homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode we are joined by Thaleia Doudali, Miguel Matos, and Anjo Vahldiek-Oberwagner to delve into five years of experience managing artifact evaluation at the EuroSys conference. They explain the goals and mechanics of artifact evaluation, a voluntary process that encourages reproducibility and reusability in computer systems research by assessing the supporting code, data, and documentation of accepted papers. The conversation outlines the three-tiered badge system, the multi-phase review process, and the importance of open-source practices. The guests present data showing increasing participation, sustained artifact availability, and varying levels of community engagement, underscoring the growing relevance of artifacts in validating and extending research.</p><br><p>The discussion also highlights recurring challenges such as tight timelines between paper acceptance and camera-ready deadlines, disparities in expectations between main program and artifact committees, difficulties with specialized hardware requirements, and lack of institutional continuity among evaluators. To address these, the guests propose early artifact preparation, stronger integration across committees, formalization of evaluation guidelines, and possibly making artifact submission mandatory. They advocate for broader standardization across CS subfields and suggest introducing a “Test of Time” award for artifacts. 
Looking to the future, they envision a more scalable, consistent, and impactful artifact evaluation process—but caution that continued growth in paper volume will demand innovation to maintain quality and reviewer sustainability.</p><br><p>Links:</p><ul><li><a href="https://nebelwelt.net/files/25REP.pdf" rel="noopener noreferrer" target="_blank">Lessons Learned from Five Years of Artifact Evaluations at EuroSys</a> [<a href="https://dx.doi.org/10.1145/3736731.3746152" rel="noopener noreferrer" target="_blank">DOI</a>] </li><li><a href="https://thaleia-dimitradoudali.github.io/" rel="noopener noreferrer" target="_blank">Thaleia's Homepage</a></li><li><a href="https://vahldiek.github.io/" rel="noopener noreferrer" target="_blank">Anjo's Homepage</a></li><li><a href="https://miguelmatos.me/" rel="noopener noreferrer" target="_blank">Miguel's Homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Dominik Winterer | Validating SMT Solvers for Correctness and Performance via Grammar-based Enumeration | #63</title>
			<itunes:title>Dominik Winterer | Validating SMT Solvers for Correctness and Performance via Grammar-based Enumeration | #63</itunes:title>
			<pubDate>Fri, 25 Jul 2025 13:57:09 GMT</pubDate>
			<itunes:duration>43:38</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/68838d356e658a8b3c8c4ccf/media.mp3" length="41901764" type="audio/mpeg"/>
			<guid isPermaLink="false">68838d356e658a8b3c8c4ccf</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/dominik-winterer</link>
			<acast:episodeId>68838d356e658a8b3c8c4ccf</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>dominik-winterer</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4Kw2AvzoOWFKX4rS+gS0F0UeSuEpEnHCVhP+5M5iLETtlamPOb/OGRS37Xc86zVJrrE9dTvqRdD6Sbat2El7i2A]]></acast:settings>
			<itunes:subtitle>Cutting Edge</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>23</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1753451010619-51f2ced1-ef12-44ee-8998-04cbf45a4b2d.jpeg"/>
			<description><![CDATA[<p>In this episode of the <em>Disseminate</em> podcast, Dominik Winterer discusses his research on SMT (Satisfiability Modulo Theories) solvers and his recent OOPSLA paper titled <em>"Validating SMT Solvers for Correctness and Performance via Grammar-Based Enumeration"</em>. Dominik shares his academic journey from the University of Freiburg to ETH Zurich, and now to a lectureship at the University of Manchester. He introduces ET, a tool he developed for exhaustive grammar-based testing of SMT solvers. Unlike traditional fuzzers that use random input generation, ET systematically enumerates small, syntactically valid inputs using context-free grammars to expose bugs more effectively. This approach simplifies bug triage and has revealed over 100 bugs—many of them soundness and performance-related—with a striking number having already been fixed. Dominik emphasizes the tool’s surprising ability to identify deep bugs using minimal input and track solver evolution over time, highlighting ET's potential for integration into CI pipelines.</p><br><p>The conversation then expands into broader reflections on formal methods and the future of software reliability. Dominik advocates for a new discipline—<em>Formal Methods Engineering</em>—to bridge the gap between software engineering and formal verification tools. He stresses the importance of building trustworthy verification tools since the reliability of software increasingly depends on them. Dominik also discusses adapting ET to other domains, such as JavaScript engines, and suggests that grammar-based enumeration can be applied widely to any system with a context-free grammar. Addressing the rise of AI, he envisions validation portfolios that integrate formal methods into LLM-based tooling, offering certified assessments of model outputs. 
He closes with a call for the community to embrace pragmatic, systematic, and scalable approaches to formal methods to ensure these tools can live up to their promises in real-world development settings.</p><br><p>Links:</p><ul><li><a href="https://wintered.github.io/" rel="noopener noreferrer" target="_blank">Dominik's Homepage</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3689795" rel="noopener noreferrer" target="_blank">Validating SMT Solvers for Correctness and Performance via Grammar-Based Enumeration</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode of the <em>Disseminate</em> podcast, Dominik Winterer discusses his research on SMT (Satisfiability Modulo Theories) solvers and his recent OOPSLA paper titled <em>"Validating SMT Solvers for Correctness and Performance via Grammar-Based Enumeration"</em>. Dominik shares his academic journey from the University of Freiburg to ETH Zurich, and now to a lectureship at the University of Manchester. He introduces ET, a tool he developed for exhaustive grammar-based testing of SMT solvers. Unlike traditional fuzzers that use random input generation, ET systematically enumerates small, syntactically valid inputs using context-free grammars to expose bugs more effectively. This approach simplifies bug triage and has revealed over 100 bugs—many of them soundness and performance-related—with a striking number having already been fixed. Dominik emphasizes the tool’s surprising ability to identify deep bugs using minimal input and track solver evolution over time, highlighting ET's potential for integration into CI pipelines.</p><br><p>The conversation then expands into broader reflections on formal methods and the future of software reliability. Dominik advocates for a new discipline—<em>Formal Methods Engineering</em>—to bridge the gap between software engineering and formal verification tools. He stresses the importance of building trustworthy verification tools since the reliability of software increasingly depends on them. Dominik also discusses adapting ET to other domains, such as JavaScript engines, and suggests that grammar-based enumeration can be applied widely to any system with a context-free grammar. Addressing the rise of AI, he envisions validation portfolios that integrate formal methods into LLM-based tooling, offering certified assessments of model outputs. He closes with a call for the community to embrace pragmatic, systematic, and scalable approaches to formal methods to ensure these tools can live up to their promises in real-world development settings.</p><br><p>Links:</p><ul><li><a href="https://wintered.github.io/" rel="noopener noreferrer" target="_blank">Dominik's Homepage</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3689795" rel="noopener noreferrer" target="_blank">Validating SMT Solvers for Correctness and Performance via Grammar-Based Enumeration</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Haralampos Gavriilidis | Fast and Scalable Data Transfer across Data Systems | #62</title>
			<itunes:title>Haralampos Gavriilidis | Fast and Scalable Data Transfer across Data Systems | #62</itunes:title>
			<pubDate>Mon, 16 Jun 2025 07:00:00 GMT</pubDate>
			<itunes:duration>56:46</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/684f04734ed3341b0ff1e7da/media.mp3" length="54501504" type="audio/mpeg"/>
			<guid isPermaLink="false">684f04734ed3341b0ff1e7da</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/haralampos-gavriilidis-fast-and-scalable-data-transfer-acros</link>
			<acast:episodeId>684f04734ed3341b0ff1e7da</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>haralampos-gavriilidis-fast-and-scalable-data-transfer-acros</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4ILWC7yJ3Pg2lrEJdjF6uw5B0uEe9LHkwRrlYEgYIYVjfq3GpJomVoMm8ygNJwdnovE4prXwEhzeluCWGjoeDAJ]]></acast:settings>
			<itunes:subtitle>Cutting Edge</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>22</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1750007675146-945bdf2b-ddfc-4e08-a966-121e1d4238d3.jpeg"/>
			<description><![CDATA[<p>In this episode of <em>Disseminate</em>, we welcome Harry Gavriilidis back to the podcast to explore his latest research on fast and scalable data transfer across systems, soon to be presented at SIGMOD 2025. Building on his work with XDB, Harry introduces <strong>XDBC</strong>, a novel data transfer framework designed to balance performance and generalizability. We dive into the challenges of moving data across heterogeneous environments—ranging from cloud systems to IoT devices—and critique the limitations of current generic methods like JDBC and specialized point-to-point connectors.</p><br><p>Harry walks us through the architecture of XDBC, which modularizes the data transfer pipeline into configurable stages like reading, serialization, compression, and networking. The episode highlights how this architecture adapts to varying performance constraints and introduces a cost-based optimizer to automate tuning for different environments. We also touch on future directions, including dynamic reconfiguration, fault tolerance, and learning-based optimizations. If you're interested in systems, performance engineering, or database interoperability, this episode is a must-listen.</p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode of <em>Disseminate</em>, we welcome Harry Gavriilidis back to the podcast to explore his latest research on fast and scalable data transfer across systems, soon to be presented at SIGMOD 2025. Building on his work with XDB, Harry introduces <strong>XDBC</strong>, a novel data transfer framework designed to balance performance and generalizability. We dive into the challenges of moving data across heterogeneous environments—ranging from cloud systems to IoT devices—and critique the limitations of current generic methods like JDBC and specialized point-to-point connectors.</p><br><p>Harry walks us through the architecture of XDBC, which modularizes the data transfer pipeline into configurable stages like reading, serialization, compression, and networking. The episode highlights how this architecture adapts to varying performance constraints and introduces a cost-based optimizer to automate tuning for different environments. We also touch on future directions, including dynamic reconfiguration, fault tolerance, and learning-based optimizations. If you're interested in systems, performance engineering, or database interoperability, this episode is a must-listen.</p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Haralampos Gavriilidis | SheetReader: Efficient spreadsheet parsing</title>
			<itunes:title>Haralampos Gavriilidis | SheetReader: Efficient spreadsheet parsing</itunes:title>
			<pubDate>Thu, 17 Apr 2025 08:00:00 GMT</pubDate>
			<itunes:duration>40:53</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/67c56f09c6cef89b7d601ce8/media.mp3" length="39250048" type="audio/mpeg"/>
			<guid isPermaLink="false">67c56f09c6cef89b7d601ce8</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/haralampos-gavriilidis-sheetreader</link>
			<acast:episodeId>67c56f09c6cef89b7d601ce8</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>haralampos-gavriilidis-sheetreader</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4KpNMVz3Tdk+uS0olgtqrQcP9kmRfxhGyjmwrr8QVkjKalXEFGKOcLw+p4sorfXn/gLDwupDSaaPXXel4aUAxB6]]></acast:settings>
			<itunes:subtitle>DuckDB in Research</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>10</itunes:season>
			<itunes:episode>6</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1741340270557-bc5c3ca2-1437-4dab-b0a4-0b1a4c71db92.jpeg"/>
			<description><![CDATA[<p>In this episode of the DuckDB in Research series, Harry Gavriilidis (PhD student at TU Berlin) joins us to discuss <strong>SheetReader</strong> — a high-performance spreadsheet parser that dramatically outpaces traditional tools in both speed and memory efficiency. By taking advantage of the standardized structure of spreadsheet files and bypassing generic XML parsers, SheetReader delivers fast and lightweight parsing, even on large files. Now available as a DuckDB extension, it enables users to query spreadsheets directly with SQL and integrate them seamlessly into broader analytical workflows.</p><br><p>Harry shares insights into the development process, performance benchmarks, and the surprisingly complex world of spreadsheet parsing. He also discusses community feedback, feature requests (like detecting multiple tables or parsing colored rows), and future plans — including tighter integration with DuckDB and support for Arrow. The conversation wraps up with a look at Harry’s broader research on composable database systems and data interoperability, highlighting how tools like DuckDB are reshaping modern data analysis.</p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode of the DuckDB in Research series, Harry Gavriilidis (PhD student at TU Berlin) joins us to discuss <strong>SheetReader</strong> — a high-performance spreadsheet parser that dramatically outpaces traditional tools in both speed and memory efficiency. By taking advantage of the standardized structure of spreadsheet files and bypassing generic XML parsers, SheetReader delivers fast and lightweight parsing, even on large files. Now available as a DuckDB extension, it enables users to query spreadsheets directly with SQL and integrate them seamlessly into broader analytical workflows.</p><br><p>Harry shares insights into the development process, performance benchmarks, and the surprisingly complex world of spreadsheet parsing. He also discusses community feedback, feature requests (like detecting multiple tables or parsing colored rows), and future plans — including tighter integration with DuckDB and support for Arrow. The conversation wraps up with a look at Harry’s broader research on composable database systems and data interoperability, highlighting how tools like DuckDB are reshaping modern data analysis.</p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title><![CDATA[Arjen P. de Vries | faiss: An extension for vector data & search]]></title>
			<itunes:title><![CDATA[Arjen P. de Vries | faiss: An extension for vector data & search]]></itunes:title>
			<pubDate>Thu, 10 Apr 2025 11:00:35 GMT</pubDate>
			<itunes:duration>46:14</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/67c56eb0c6cef89b7d600e04/media.mp3" length="44388480" type="audio/mpeg"/>
			<guid isPermaLink="false">67c56eb0c6cef89b7d600e04</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/arjen-p-de-vries-faiss</link>
			<acast:episodeId>67c56eb0c6cef89b7d600e04</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>arjen-p-de-vries-faiss</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4KEC/SuJnbeYPAvSE10X0K0AivEh/ztukB/jkGhiVhxCHCaGkeVkTA3vIHKXOEnOt9QaXJSYG7mNAroqHMUyJdD]]></acast:settings>
			<itunes:subtitle>DuckDB in Research</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>10</itunes:season>
			<itunes:episode>5</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1741340423583-449c13ec-ab3a-46d3-89a4-a24d0ae23fa2.jpeg"/>
			<description><![CDATA[<p>In this episode of the DuckDB in Research series, we’re joined by Arjen de Vries, Professor of Data Science at Radboud University. Arjen dives into his team’s development of a DuckDB extension for FAISS, a library originally developed at Facebook for efficient similarity search and vector operations.</p><br><p>We explore the growing importance of embeddings and dense retrieval in modern information retrieval systems, and how DuckDB’s zero-copy architecture and tight integration with the Python ecosystem make it a compelling choice for managing large-scale vector data. Arjen shares insights into the technical challenges and architectural decisions behind the extension, comparisons with DuckDB’s native VSS (vector search) solution, and the broader vision of integrating vector search more deeply into relational databases.</p><br><p>Along the way, we also touch on DuckDB's extension ecosystem, its potential for future research, and why tools like this are reshaping how we build and query modern AI-enabled systems.</p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode of the DuckDB in Research series, we’re joined by Arjen de Vries, Professor of Data Science at Radboud University. Arjen dives into his team’s development of a DuckDB extension for FAISS, a library originally developed at Facebook for efficient similarity search and vector operations.</p><br><p>We explore the growing importance of embeddings and dense retrieval in modern information retrieval systems, and how DuckDB’s zero-copy architecture and tight integration with the Python ecosystem make it a compelling choice for managing large-scale vector data. Arjen shares insights into the technical challenges and architectural decisions behind the extension, comparisons with DuckDB’s native VSS (vector search) solution, and the broader vision of integrating vector search more deeply into relational databases.</p><br><p>Along the way, we also touch on DuckDB's extension ecosystem, its potential for future research, and why tools like this are reshaping how we build and query modern AI-enabled systems.</p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>David Justen | POLAR: Adaptive and non-invasive join order selection via plans of least resistance</title>
			<itunes:title>David Justen | POLAR: Adaptive and non-invasive join order selection via plans of least resistance</itunes:title>
			<pubDate>Thu, 03 Apr 2025 09:00:00 GMT</pubDate>
			<itunes:duration>51:08</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/67c56f31c6cef89b7d602607/media.mp3" length="49098880" type="audio/mpeg"/>
			<guid isPermaLink="false">67c56f31c6cef89b7d602607</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.vldb.org/pvldb/vol17/p1350-justen.pdf</link>
			<acast:episodeId>67c56f31c6cef89b7d602607</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>david-justen-polar</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4LvczlB4NWyw7yjtvSSoPXoKF71byNb0V1WRgPgMgG3OFimCkJQWC5SmMfboBLHim8UQ0ChwalsTSCzYS0KMyp/]]></acast:settings>
			<itunes:subtitle>DuckDB in Research</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>10</itunes:season>
			<itunes:episode>4</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1741340013808-b42a3f2b-f36f-4e05-a718-ccf8bf7639b1.jpeg"/>
			<description><![CDATA[<p>In this episode, we sit down with David Justen to discuss his work on POLAR: Adaptive and Non-invasive Join Order Selection via Plans of Least Resistance which was implemented in DuckDB. David shares his journey in the database space, insights into performance optimization, and the challenges of working with modern analytical workloads. We dive into the intricacies of query compilation, vectorized execution, and how DuckDB is shaping the future of in-memory databases. Tune in for a deep dive into database internals, industry trends, and what’s next for high-performance data processing!</p><br><p>Links: </p><ul><li><a href="https://www.vldb.org/pvldb/vol17/p1350-justen.pdf" rel="noopener noreferrer" target="_blank">VLDB 2024 Paper</a></li><li><a href="https://d-justen.github.io/" rel="noopener noreferrer" target="_blank">David's Homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, we sit down with David Justen to discuss his work on POLAR: Adaptive and Non-invasive Join Order Selection via Plans of Least Resistance which was implemented in DuckDB. David shares his journey in the database space, insights into performance optimization, and the challenges of working with modern analytical workloads. We dive into the intricacies of query compilation, vectorized execution, and how DuckDB is shaping the future of in-memory databases. Tune in for a deep dive into database internals, industry trends, and what’s next for high-performance data processing!</p><br><p>Links: </p><ul><li><a href="https://www.vldb.org/pvldb/vol17/p1350-justen.pdf" rel="noopener noreferrer" target="_blank">VLDB 2024 Paper</a></li><li><a href="https://d-justen.github.io/" rel="noopener noreferrer" target="_blank">David's Homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Daniël ten Wolde | DuckPGQ: A graph extension supporting SQL/PGQ</title>
			<itunes:title>Daniël ten Wolde | DuckPGQ: A graph extension supporting SQL/PGQ</itunes:title>
			<pubDate>Thu, 20 Mar 2025 11:40:00 GMT</pubDate>
			<itunes:duration>48:38</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/67c56e54c6cef89b7d5ffb38/media.mp3" length="46694528" type="audio/mpeg"/>
			<guid isPermaLink="false">67c56e54c6cef89b7d5ffb38</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/daniel-ten-wolde-duckpg</link>
			<acast:episodeId>67c56e54c6cef89b7d5ffb38</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>daniel-ten-wolde-duckpg</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4J6Pkq0D/+cfOwWyTLdNXpoNAEG7Y/pXKk7RzpVo/UudIUXAys2gIsmyBDkhhQrl9R/HRXV66qMQPjB7/ppaoKm]]></acast:settings>
			<itunes:subtitle>DuckDB in Research</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>10</itunes:season>
			<itunes:episode>3</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1741340295949-f62a4c6a-3fa5-4159-825a-941228e5269d.jpeg"/>
			<description><![CDATA[<p>In this episode, we sit down with Daniël ten Wolde, a PhD researcher at CWI’s Database Architectures Group, to explore DuckPGQ—an extension to DuckDB that brings powerful graph querying capabilities to relational databases. Daniël shares his journey into database research, the motivations behind DuckPGQ, and how it simplifies working with graph data. We also dive into the technical challenges of implementing SQL Property Graph Queries (SQL/PGQ) in DuckDB, discuss performance benchmarks, and explore the future of DuckPGQ in graph analytics and machine learning. Tune in to learn how this cutting-edge extension is bridging the gap between research and industry!</p><br><p>Links:</p><ul><li><a href="https://duckpgq.org/" rel="noopener noreferrer" target="_blank">DuckPGQ</a> homepage</li><li><a href="https://duckdb.org/community_extensions/extensions/duckpgq.html" rel="noopener noreferrer" target="_blank">Community extension</a></li><li><a href="https://www.cwi.nl/en/people/daniel-ten-wolde/" rel="noopener noreferrer" target="_blank">Daniël's homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, we sit down with Daniël ten Wolde, a PhD researcher at CWI’s Database Architectures Group, to explore DuckPGQ—an extension to DuckDB that brings powerful graph querying capabilities to relational databases. Daniël shares his journey into database research, the motivations behind DuckPGQ, and how it simplifies working with graph data. We also dive into the technical challenges of implementing SQL Property Graph Queries (SQL/PGQ) in DuckDB, discuss performance benchmarks, and explore the future of DuckPGQ in graph analytics and machine learning. Tune in to learn how this cutting-edge extension is bridging the gap between research and industry!</p><br><p>Links:</p><ul><li><a href="https://duckpgq.org/" rel="noopener noreferrer" target="_blank">DuckPGQ</a> homepage</li><li><a href="https://duckdb.org/community_extensions/extensions/duckpgq.html" rel="noopener noreferrer" target="_blank">Community extension</a></li><li><a href="https://www.cwi.nl/en/people/daniel-ten-wolde/" rel="noopener noreferrer" target="_blank">Daniël's homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Till Döhmen | DuckDQ: A Python library for data quality checks in ML pipelines</title>
			<itunes:title>Till Döhmen | DuckDQ: A Python library for data quality checks in ML pipelines</itunes:title>
			<pubDate>Thu, 13 Mar 2025 10:30:00 GMT</pubDate>
			<itunes:duration>58:12</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/67c56ecda02912270ddc16eb/media.mp3" length="55883904" type="audio/mpeg"/>
			<guid isPermaLink="false">67c56ecda02912270ddc16eb</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/till-dohmen-duckdq</link>
			<acast:episodeId>67c56ecda02912270ddc16eb</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>till-dohmen-duckdq</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4JJ4q7VssSKJiVnbq+4Y3Mm1lLaqcHvB17grISHWT8qoVys31xpgVkheoXPK5ZBXgWJsP2trr4/PuVtsq2RKdNP]]></acast:settings>
			<itunes:subtitle>DuckDB in Research</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>10</itunes:season>
			<itunes:episode>2</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1741340399620-a0458676-e624-46df-b182-24e0cec9cbd7.jpeg"/>
			<description><![CDATA[<p>In this episode we kick off our <em>DuckDB in Research</em> series with Till Döhmen, a software engineer at MotherDuck, where he leads AI efforts. Till shares insights into <em>DuckDQ</em>, a Python library designed for efficient data quality validation in machine learning pipelines, leveraging DuckDB’s high-performance querying capabilities.</p><br><p>We discuss the challenges of ensuring data integrity in ML workflows, the inefficiencies of existing solutions, and how DuckDQ provides a lightweight, drop-in replacement that seamlessly integrates with scikit-learn. Till also reflects on his research journey, the impact of DuckDB’s optimizations, and the future potential of data quality tooling. Plus, we explore how AI tools like ChatGPT are reshaping research and productivity. Tune in for a deep dive into the intersection of databases, machine learning, and data validation!</p><br><p>Resources:</p><ul><li><a href="https://github.com/tdoehmen/duckdq" rel="noopener noreferrer" target="_blank">GitHub</a></li><li><a href="https://ssc.io/pdf/duckdq.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://dsdsd.da.cwi.nl/slides/dsdsd-duckdq.pdf" rel="noopener noreferrer" target="_blank">Slides</a></li><li><a href="https://tdoehmen.github.io/" rel="noopener noreferrer" target="_blank">Till's Homepage</a></li><li><a href="https://github.com/rustyconover/duckdb-datasketches" rel="noopener noreferrer" target="_blank">datasketches extension</a> (released by a DuckDB community member 2 weeks after we recorded!)</li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode we kick off our <em>DuckDB in Research</em> series with Till Döhmen, a software engineer at MotherDuck, where he leads AI efforts. Till shares insights into <em>DuckDQ</em>, a Python library designed for efficient data quality validation in machine learning pipelines, leveraging DuckDB’s high-performance querying capabilities.</p><br><p>We discuss the challenges of ensuring data integrity in ML workflows, the inefficiencies of existing solutions, and how DuckDQ provides a lightweight, drop-in replacement that seamlessly integrates with scikit-learn. Till also reflects on his research journey, the impact of DuckDB’s optimizations, and the future potential of data quality tooling. Plus, we explore how AI tools like ChatGPT are reshaping research and productivity. Tune in for a deep dive into the intersection of databases, machine learning, and data validation!</p><br><p>Resources:</p><ul><li><a href="https://github.com/tdoehmen/duckdq" rel="noopener noreferrer" target="_blank">GitHub</a></li><li><a href="https://ssc.io/pdf/duckdq.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://dsdsd.da.cwi.nl/slides/dsdsd-duckdq.pdf" rel="noopener noreferrer" target="_blank">Slides</a></li><li><a href="https://tdoehmen.github.io/" rel="noopener noreferrer" target="_blank">Till's Homepage</a></li><li><a href="https://github.com/rustyconover/duckdb-datasketches" rel="noopener noreferrer" target="_blank">datasketches extension</a> (released by a DuckDB community member 2 weeks after we recorded!)</li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Disseminate x DuckDB Coming Soon...</title>
			<itunes:title>Disseminate x DuckDB Coming Soon...</itunes:title>
			<pubDate>Thu, 06 Mar 2025 11:00:00 GMT</pubDate>
			<itunes:duration>2:40</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/67c75bd4ece4993ac7e08697/media.mp3" length="2570368" type="audio/mpeg"/>
			<guid isPermaLink="false">67c75bd4ece4993ac7e08697</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/disseminate-x-duckdb-coming-soon</link>
			<acast:episodeId>67c75bd4ece4993ac7e08697</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>disseminate-x-duckdb-coming-soon</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4JB00V++csQTmgt101nST37G+IVT8iSLziW2r1v85465Wxrt0yuVdtp/ZnxIT2UZr5M73Qhv0XFThE8PYV7EZ8S]]></acast:settings>
			<itunes:subtitle>DuckDB in Research</itunes:subtitle>
			<itunes:episodeType>trailer</itunes:episodeType>
			<itunes:season>10</itunes:season>
			<itunes:episode>1</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1741214078642-00bb0693-dc8d-44d5-90f9-2ca85fb980c1.jpeg"/>
			<description><![CDATA[<p>Hey folks! </p><br><p>We have been collaborating with everyone's favourite in-process SQL OLAP database management system <em>DuckDB</em> to bring you a new podcast series - the <strong>DuckDB in Research</strong> series!</p><br><p>At <em>Disseminate</em> our mission is to bridge the gap between research and industry by exploring research that has a real-world impact. DuckDB embodies this synergy—decades of research underpin its design, and now it’s making waves in the research community as a platform for others to build on, and that is what this series will focus on! </p><br><p>Join us as we kick off the series with:</p><p>📌 <strong>Daniël ten Wolde</strong> – DuckPGQ, a graph workload extension for DuckDB supporting SQL/PGQ</p><p>📌 <strong>David Justen</strong> – POLAR: Adaptive, non-invasive join order selection </p><p>📌 <strong>Till Döhmen</strong> – DuckDQ: A Python library for data quality checks in ML pipelines</p><p>📌 <strong>Arjen de Vries</strong> – FAISS extension for vector similarity search in DuckDB</p><p>📌 <strong>Harry Gavriilidis</strong> – SheetReader: Efficient spreadsheet parsing</p><br><p>Whether you're a researcher, an engineer, or just curious about the intersection of databases and innovation, we are sure you will love this series. </p><br><p>Subscribe now and stay tuned for our first episode! 🚀</p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>Hey folks! </p><br><p>We have been collaborating with everyone's favourite in-process SQL OLAP database management system <em>DuckDB</em> to bring you a new podcast series - the <strong>DuckDB in Research</strong><em> </em>series!</p><br><p>At <em>Disseminate</em> our mission is to bridge the gap between research and industry by exploring research that has a real-world impact. DuckDB embodies this synergy—decades of research underpin its design, and now it’s making waves in the research community as a platform for others to build on. That is exactly what this series will focus on!</p><br><p>Join us as we kick off the series with:</p><p>📌 <strong>Daniel ten Wolde</strong> – DuckPGQ, a graph workload extension for DuckDB supporting SQL/PGQ</p><p>📌 <strong>David Justen</strong> – POLAR: Adaptive, non-invasive join order selection</p><p>📌 <strong>Till Döhmen</strong> – DuckDQ: A Python library for data quality checks in ML pipelines</p><p>📌 <strong>Arjen de Vries</strong> – FAISS extension for vector similarity search in DuckDB</p><p>📌 <strong>Harry Gavriilidis</strong> – SheetReader: Efficient spreadsheet parsing</p><br><p>Whether you're a researcher, engineer, or just curious about the intersection of databases and innovation, we are sure you will love this series.</p><br><p>Subscribe now and stay tuned for our first episode! 🚀</p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>High Impact in Databases with... Anastasia Ailamaki</title>
			<itunes:title>High Impact in Databases with... Anastasia Ailamaki</itunes:title>
			<pubDate>Mon, 03 Mar 2025 08:53:02 GMT</pubDate>
			<itunes:duration>46:17</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/67c56dee22548f88880ae18c/media.mp3" length="44441728" type="audio/mpeg"/>
			<guid isPermaLink="false">67c56dee22548f88880ae18c</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://people.epfl.ch/anastasia.ailamaki/?lang=en</link>
			<acast:episodeId>67c56dee22548f88880ae18c</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>high-impact-in-databases-with-anastasia-ailamaki</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4LoKb1ekR/a2VSQVafohdqdbJ+RNuxL6OsrEwRok5z3cQB3sVZOdWoeIqi6HItpfQ9CbUfAQHeOjlfIaSQSUzk8]]></acast:settings>
			<itunes:subtitle>High Impact</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>7</itunes:season>
			<itunes:episode>10</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1740991615779-1820eb0a-23a3-4a81-a1c0-acf34c5553f6.jpeg"/>
			<description><![CDATA[<p>In this High Impact in Databases episode, we talk to <a href="https://people.epfl.ch/anastasia.ailamaki/?lang=en" rel="noopener noreferrer" target="_blank">Anastasia Ailamaki</a>.</p><br><p>Anastasia is a Professor of Computer and Communication Sciences at the École Polytechnique Fédérale de Lausanne (EPFL). Tune in to hear Anastasia's story!</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a>, the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open-source temporal graph analytics engine for Python and Rust.</p><br><p>You can find Anastasia on:</p><ul><li><a href="https://people.epfl.ch/anastasia.ailamaki/?lang=en" rel="noopener noreferrer" target="_blank">Homepage</a></li><li><a href="https://scholar.google.com/citations?user=80pKMyMAAAAJ&amp;hl=en" rel="noopener noreferrer" target="_blank">Google Scholar</a></li><li><a href="https://www.linkedin.com/in/natassa/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this High Impact in Databases episode, we talk to <a href="https://people.epfl.ch/anastasia.ailamaki/?lang=en" rel="noopener noreferrer" target="_blank">Anastasia Ailamaki</a>.</p><br><p>Anastasia is a Professor of Computer and Communication Sciences at the École Polytechnique Fédérale de Lausanne (EPFL). Tune in to hear Anastasia's story!</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a>, the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open-source temporal graph analytics engine for Python and Rust.</p><br><p>You can find Anastasia on:</p><ul><li><a href="https://people.epfl.ch/anastasia.ailamaki/?lang=en" rel="noopener noreferrer" target="_blank">Homepage</a></li><li><a href="https://scholar.google.com/citations?user=80pKMyMAAAAJ&amp;hl=en" rel="noopener noreferrer" target="_blank">Google Scholar</a></li><li><a href="https://www.linkedin.com/in/natassa/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Anastasiia Kozar | Fault Tolerance Placement in the Internet of Things | #61</title>
			<itunes:title>Anastasiia Kozar | Fault Tolerance Placement in the Internet of Things | #61</itunes:title>
			<pubDate>Mon, 16 Dec 2024 08:11:58 GMT</pubDate>
			<itunes:duration>49:02</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/675f0042a89833ab777050a6/media.mp3" length="47077504" type="audio/mpeg"/>
			<guid isPermaLink="false">675f0042a89833ab777050a6</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://dl.acm.org/doi/pdf/10.1145/3654941</link>
			<acast:episodeId>675f0042a89833ab777050a6</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>anastasiia-kozar-fault-tolerance-placement</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4JJP3F9B/cox6J+BdYXm6+pta0B7ihThNHUcK/e3MYNjVspv8alzN1YzaOu9kCUTNkT9CnsPtELI+3KyHI+Gzi/]]></acast:settings>
			<itunes:subtitle>Cutting Edge</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>21</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1734276499687-78974953-aca3-4b8f-b716-1593736db7a8.jpeg"/>
			<description><![CDATA[<p>In this episode, we chat with Anastasiia Kozar about her research on fault tolerance in resource-constrained environments. As IoT applications leverage sensors, edge devices, and cloud infrastructure, ensuring system reliability at the edge poses unique challenges. Unlike the cloud, edge devices operate without persistent backups or high availability standards, leading to increased vulnerability to failures. Anastasiia explains how traditional methods fall short, as they fail to align resource allocation with fault tolerance needs, often resulting in system underperformance.</p><br><p>To address this, Anastasiia introduces a novel resource-aware approach that combines operator placement and fault tolerance into a unified process. By optimizing where and how data is backed up, her solution significantly improves system reliability, especially for low-end edge devices with limited resources. The result? Up to a tenfold increase in throughput compared to existing methods. Tune in to learn more!</p><br><p>Links:</p><ul><li><a href="https://dl.acm.org/doi/pdf/10.1145/3654941" rel="noopener noreferrer" target="_blank">Fault Tolerance Placement in the Internet of Things [SIGMOD'24]</a></li><li><a href="https://nebula.stream/paper/zeuch_cidr20.pdf" rel="noopener noreferrer" target="_blank">The NebulaStream Platform: Data and Application Management for the Internet of Things [CIDR'20]</a></li><li><a href="https://nebula.stream/" rel="noopener noreferrer" target="_blank">nebula.stream</a></li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, we chat with Anastasiia Kozar about her research on fault tolerance in resource-constrained environments. As IoT applications leverage sensors, edge devices, and cloud infrastructure, ensuring system reliability at the edge poses unique challenges. Unlike the cloud, edge devices operate without persistent backups or high availability standards, leading to increased vulnerability to failures. Anastasiia explains how traditional methods fall short, as they fail to align resource allocation with fault tolerance needs, often resulting in system underperformance.</p><br><p>To address this, Anastasiia introduces a novel resource-aware approach that combines operator placement and fault tolerance into a unified process. By optimizing where and how data is backed up, her solution significantly improves system reliability, especially for low-end edge devices with limited resources. The result? Up to a tenfold increase in throughput compared to existing methods. Tune in to learn more!</p><br><p>Links:</p><ul><li><a href="https://dl.acm.org/doi/pdf/10.1145/3654941" rel="noopener noreferrer" target="_blank">Fault Tolerance Placement in the Internet of Things [SIGMOD'24]</a></li><li><a href="https://nebula.stream/paper/zeuch_cidr20.pdf" rel="noopener noreferrer" target="_blank">The NebulaStream Platform: Data and Application Management for the Internet of Things [CIDR'20]</a></li><li><a href="https://nebula.stream/" rel="noopener noreferrer" target="_blank">nebula.stream</a></li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Liana Patel | ACORN: Performant and Predicate-Agnostic Hybrid Search | #60</title>
			<itunes:title>Liana Patel | ACORN: Performant and Predicate-Agnostic Hybrid Search | #60</itunes:title>
			<pubDate>Mon, 11 Nov 2024 08:27:05 GMT</pubDate>
			<itunes:duration>52:49</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/672e3be1a42e23dc4bb65f23/media.mp3" length="50712704" type="audio/mpeg"/>
			<guid isPermaLink="false">672e3be1a42e23dc4bb65f23</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://arxiv.org/pdf/2403.04871</link>
			<acast:episodeId>672e3be1a42e23dc4bb65f23</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>liana-patel-acorn</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4LcesWMvmzbk9uZ8CorL0xb73Ycg337J+aenKreH81E5Q0TWyc+VeNCkC+xSlMG9dcVlFQQRNQJo2uXFMDT5Suy]]></acast:settings>
			<itunes:subtitle>Cutting Edge</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>20</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1731082591591-737db056-bd47-47dd-8b8f-5925c24ddac1.jpeg"/>
			<description><![CDATA[<p>In this episode, we chat with Liana Patel to discuss ACORN, a groundbreaking method for hybrid search in applications using mixed-modality data. As more systems require simultaneous access to embedded images, text, video, and structured data, traditional search methods struggle to maintain efficiency and flexibility. Liana explains how ACORN, leveraging Hierarchical Navigable Small Worlds (HNSW), enables efficient, predicate-agnostic searches by introducing innovative predicate subgraph traversal. This allows ACORN to outperform existing methods significantly, supporting complex query semantics and achieving 2–1,000 times higher throughput on diverse datasets. Tune in to learn more!</p><br><p>Links:</p><ul><li><a href="https://arxiv.org/pdf/2403.04871" rel="noopener noreferrer" target="_blank">ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and Structured Data</a> [SIGMOD'24]</li><li><a href="https://www.linkedin.com/in/liana-patel-b0a51316a/" rel="noopener noreferrer" target="_blank">Liana's LinkedIn</a></li><li><a href="https://x.com/lianapatel_" rel="noopener noreferrer" target="_blank">Liana's X</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, we chat with Liana Patel to discuss ACORN, a groundbreaking method for hybrid search in applications using mixed-modality data. As more systems require simultaneous access to embedded images, text, video, and structured data, traditional search methods struggle to maintain efficiency and flexibility. Liana explains how ACORN, leveraging Hierarchical Navigable Small Worlds (HNSW), enables efficient, predicate-agnostic searches by introducing innovative predicate subgraph traversal. This allows ACORN to outperform existing methods significantly, supporting complex query semantics and achieving 2–1,000 times higher throughput on diverse datasets. Tune in to learn more!</p><br><p>Links:</p><ul><li><a href="https://arxiv.org/pdf/2403.04871" rel="noopener noreferrer" target="_blank">ACORN: Performant and Predicate-Agnostic Search Over Vector Embeddings and Structured Data</a> [SIGMOD'24]</li><li><a href="https://www.linkedin.com/in/liana-patel-b0a51316a/" rel="noopener noreferrer" target="_blank">Liana's LinkedIn</a></li><li><a href="https://x.com/lianapatel_" rel="noopener noreferrer" target="_blank">Liana's X</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>High Impact in Databases with... David Maier</title>
			<itunes:title>High Impact in Databases with... David Maier</itunes:title>
			<pubDate>Mon, 04 Nov 2024 08:02:20 GMT</pubDate>
			<itunes:duration>1:02:24</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/67083460011dc7d6440e3102/media.mp3" length="59914368" type="audio/mpeg"/>
			<guid isPermaLink="false">67083460011dc7d6440e3102</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/high-impact-in-databases-with-david-maier</link>
			<acast:episodeId>67083460011dc7d6440e3102</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>high-impact-in-databases-with-david-maier</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4Itrh/+fE8EyqB3N1/wClMYVBRdINpKgUYEMJLZjADrZGRgHYYzVTl6d0Glr9zhaIjgCtIDUD53PmC0vVCCNmaS]]></acast:settings>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>7</itunes:season>
			<itunes:episode>9</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1728590871735-3d508329-a1c8-4b72-9aa4-3b8a9ca2bf84.jpeg"/>
			<description><![CDATA[<p>In this High Impact episode, we talk to <a href="https://web.cecs.pdx.edu/~maier/" rel="noopener noreferrer" target="_blank">David Maier</a>.</p><br><p>David is the Maseeh Professor Emeritus of Emerging Technologies at Portland State University. Tune in to hear David's story and learn about some of his most impactful work.</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a>, the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open-source temporal graph analytics engine for Python and Rust.</p><br><p>You can find David on:</p><ul><li><a href="https://web.cecs.pdx.edu/~maier/" rel="noopener noreferrer" target="_blank">Homepage</a></li><li><a href="https://scholar.google.com/citations?user=80pKMyMAAAAJ&amp;hl=en" rel="noopener noreferrer" target="_blank">Google Scholar</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this High Impact episode, we talk to <a href="https://web.cecs.pdx.edu/~maier/" rel="noopener noreferrer" target="_blank">David Maier</a>.</p><br><p>David is the Maseeh Professor Emeritus of Emerging Technologies at Portland State University. Tune in to hear David's story and learn about some of his most impactful work.</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a>, the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open-source temporal graph analytics engine for Python and Rust.</p><br><p>You can find David on:</p><ul><li><a href="https://web.cecs.pdx.edu/~maier/" rel="noopener noreferrer" target="_blank">Homepage</a></li><li><a href="https://scholar.google.com/citations?user=80pKMyMAAAAJ&amp;hl=en" rel="noopener noreferrer" target="_blank">Google Scholar</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Raunak Shah | R2D2: Reducing Redundancy and Duplication in Data Lakes | #59</title>
			<itunes:title>Raunak Shah | R2D2: Reducing Redundancy and Duplication in Data Lakes | #59</itunes:title>
			<pubDate>Mon, 28 Oct 2024 08:20:11 GMT</pubDate>
			<itunes:duration>31:09</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/6704dd306f369dd035f019cb/media.mp3" length="29913216" type="audio/mpeg"/>
			<guid isPermaLink="false">6704dd306f369dd035f019cb</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://arxiv.org/pdf/2312.13427</link>
			<acast:episodeId>6704dd306f369dd035f019cb</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>raunak-shah-r2d2-reducing-redundancy-and-duplication-in-data</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4LkNIg7Aw3waGofet5o9ph+6m29itLdCh63gyKfHJn1fN/1ujSg6ywMQyeO2vUuc7r0vxQyBieIoY7OWZgcFcpF]]></acast:settings>
			<itunes:subtitle><![CDATA[SIGMOD'24]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>19</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1728371900259-06da49be-9f90-449b-92c1-7a21173f527a.jpeg"/>
			<description><![CDATA[<p>In this episode, Raunak Shah joins us to discuss the critical issue of data redundancy in enterprise data lakes, which can lead to soaring storage and maintenance costs. Raunak highlights how large-scale data environments, ranging from terabytes to petabytes, often contain duplicate and redundant datasets that are difficult to manage. He introduces the concept of "dataset containment" and explains its significance in identifying and reducing redundancy at the table level in these massive data lakes—an area where there has been little prior work.</p><br><p>Raunak then dives into the details of R2D2, a novel three-step hierarchical pipeline designed to efficiently tackle dataset containment. By utilizing schema containment graphs, statistical min-max pruning, and content-level pruning, R2D2 progressively reduces the search space to pinpoint redundant data. Raunak also discusses how the system, implemented on platforms like Azure Databricks and AWS, offers significant improvements over existing methods, processing TB-scale data lakes in just a few hours with high accuracy. He concludes with a discussion on how R2D2 optimally balances storage savings and performance by identifying datasets that can be deleted and reconstructed on demand, providing valuable insights for enterprises aiming to streamline their data management strategies.</p><br><p>Materials:</p><ul><li><a href="https://arxiv.org/pdf/2312.13427" rel="noopener noreferrer" target="_blank">SIGMOD'24 Paper - R2D2: Reducing Redundancy and Duplication in Data Lakes</a></li><li><a href="https://arxiv.org/pdf/2305.14818" rel="noopener noreferrer" target="_blank">ICDE'24 - Towards Optimizing Storage Costs in the Cloud</a> </li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, Raunak Shah joins us to discuss the critical issue of data redundancy in enterprise data lakes, which can lead to soaring storage and maintenance costs. Raunak highlights how large-scale data environments, ranging from terabytes to petabytes, often contain duplicate and redundant datasets that are difficult to manage. He introduces the concept of "dataset containment" and explains its significance in identifying and reducing redundancy at the table level in these massive data lakes—an area where there has been little prior work.</p><br><p>Raunak then dives into the details of R2D2, a novel three-step hierarchical pipeline designed to efficiently tackle dataset containment. By utilizing schema containment graphs, statistical min-max pruning, and content-level pruning, R2D2 progressively reduces the search space to pinpoint redundant data. Raunak also discusses how the system, implemented on platforms like Azure Databricks and AWS, offers significant improvements over existing methods, processing TB-scale data lakes in just a few hours with high accuracy. He concludes with a discussion on how R2D2 optimally balances storage savings and performance by identifying datasets that can be deleted and reconstructed on demand, providing valuable insights for enterprises aiming to streamline their data management strategies.</p><br><p>Materials:</p><ul><li><a href="https://arxiv.org/pdf/2312.13427" rel="noopener noreferrer" target="_blank">SIGMOD'24 Paper - R2D2: Reducing Redundancy and Duplication in Data Lakes</a></li><li><a href="https://arxiv.org/pdf/2305.14818" rel="noopener noreferrer" target="_blank">ICDE'24 - Towards Optimizing Storage Costs in the Cloud</a> </li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>High Impact in Databases with... Aditya Parameswaran</title>
			<itunes:title>High Impact in Databases with... Aditya Parameswaran</itunes:title>
			<pubDate>Mon, 21 Oct 2024 07:02:49 GMT</pubDate>
			<itunes:duration>58:57</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/670833d3f7743dfe52f1eea8/media.mp3" length="56602752" type="audio/mpeg"/>
			<guid isPermaLink="false">670833d3f7743dfe52f1eea8</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/high-impact-in-databases-with-aditya-parameswaran</link>
			<acast:episodeId>670833d3f7743dfe52f1eea8</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>high-impact-in-databases-with-aditya-parameswaran</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4IL0yqrM7AIKAPwdIeKn7mxj7k+iPIz0+YOfVhjGHHnjAPwWYpjPu0A7QW1UUBtxSclMEY1ByHHHyw1Ojg0yzNQ]]></acast:settings>
			<itunes:subtitle>High Impact</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>7</itunes:season>
			<itunes:episode>8</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1728590673249-278811c8-f8d0-4b3c-b9a8-e65de4648679.jpeg"/>
			<description><![CDATA[<p>In this High Impact episode, we talk to <a href="https://people.eecs.berkeley.edu/~adityagp/" rel="noopener noreferrer" target="_blank">Aditya Parameswaran</a> about some of his most impactful work.</p><br><p>Aditya is an Associate Professor at the University of California, Berkeley. Tune in to hear Aditya's story!</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a>, the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open-source temporal graph analytics engine for Python and Rust.</p><br><p>Links:</p><ul><li><a href="https://epic.berkeley.edu/" rel="noopener noreferrer" target="_blank">EPIC Data Lab</a></li><li><a href="https://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper19.pdf" rel="noopener noreferrer" target="_blank">Answering Queries using Humans, Algorithms and Databases</a> (CIDR'11)</li><li><a href="https://control.cs.berkeley.edu/pwheel-vldb.pdf" rel="noopener noreferrer" target="_blank">Potter’s Wheel: An Interactive Data Cleaning System</a> (VLDB'01)</li><li><a href="https://control.cs.berkeley.edu/online/online.pdf" rel="noopener noreferrer" target="_blank">Online Aggregation</a> (SIGMOD'97)</li><li><a href="https://graphics.stanford.edu/papers/polaris/polaris.pdf" rel="noopener noreferrer" target="_blank">Polaris: A System for Query, Analysis and Visualization of Multi-dimensional Relational Databases</a>&nbsp;(INFOVIS'00)</li><li><a href="https://www.loom.com/share/89bfb10668d94595b265a156126474a5" rel="noopener noreferrer" target="_blank">Coping with Rejection</a> </li><li><a href="https://ponder.io/company/" rel="noopener noreferrer" target="_blank">Ponder</a></li></ul><p><br></p><p>You can find Aditya on:</p><ul><li><a href="https://x.com/adityagp" rel="noopener noreferrer" target="_blank">Twitter</a></li><li><a 
href="https://www.linkedin.com/in/aditya-parameswaran-0714b63/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li><li><a href="https://scholar.google.com/citations?user=VeB3UbcAAAAJ&amp;hl=en&amp;oi=ao" rel="noopener noreferrer" target="_blank">Google Scholar</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this High Impact episode, we talk to <a href="https://people.eecs.berkeley.edu/~adityagp/" rel="noopener noreferrer" target="_blank">Aditya Parameswaran</a> about some of his most impactful work.</p><br><p>Aditya is an Associate Professor at the University of California, Berkeley. Tune in to hear Aditya's story!</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a>, the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open-source temporal graph analytics engine for Python and Rust.</p><br><p>Links:</p><ul><li><a href="https://epic.berkeley.edu/" rel="noopener noreferrer" target="_blank">EPIC Data Lab</a></li><li><a href="https://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper19.pdf" rel="noopener noreferrer" target="_blank">Answering Queries using Humans, Algorithms and Databases</a> (CIDR'11)</li><li><a href="https://control.cs.berkeley.edu/pwheel-vldb.pdf" rel="noopener noreferrer" target="_blank">Potter’s Wheel: An Interactive Data Cleaning System</a> (VLDB'01)</li><li><a href="https://control.cs.berkeley.edu/online/online.pdf" rel="noopener noreferrer" target="_blank">Online Aggregation</a> (SIGMOD'97)</li><li><a href="https://graphics.stanford.edu/papers/polaris/polaris.pdf" rel="noopener noreferrer" target="_blank">Polaris: A System for Query, Analysis and Visualization of Multi-dimensional Relational Databases</a>&nbsp;(INFOVIS'00)</li><li><a href="https://www.loom.com/share/89bfb10668d94595b265a156126474a5" rel="noopener noreferrer" target="_blank">Coping with Rejection</a> </li><li><a href="https://ponder.io/company/" rel="noopener noreferrer" target="_blank">Ponder</a></li></ul><p><br></p><p>You can find Aditya on:</p><ul><li><a href="https://x.com/adityagp" rel="noopener noreferrer" target="_blank">Twitter</a></li><li><a 
href="https://www.linkedin.com/in/aditya-parameswaran-0714b63/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li><li><a href="https://scholar.google.com/citations?user=VeB3UbcAAAAJ&amp;hl=en&amp;oi=ao" rel="noopener noreferrer" target="_blank">Google Scholar</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Marco Costa | Taming Adversarial Queries with Optimal Range Filters | #58</title>
			<itunes:title>Marco Costa | Taming Adversarial Queries with Optimal Range Filters | #58</itunes:title>
			<pubDate>Mon, 14 Oct 2024 07:16:32 GMT</pubDate>
			<itunes:duration>37:07</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/6704dc561a3de581c69f66c3/media.mp3" length="35647616" type="audio/mpeg"/>
			<guid isPermaLink="false">6704dc561a3de581c69f66c3</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://arxiv.org/pdf/2311.15380</link>
			<acast:episodeId>6704dc561a3de581c69f66c3</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>marco-costa-taming-adversarial-queries-with-optimal-range-fi</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4Jsoz86uy7YG7RdU6pElkPYqO+8lsRdeZYs4zKlTlzqHLXs4zOPYHTh/QcHHeXcr6JDDS+bdpareAIJ4ZUqh53y]]></acast:settings>
			<itunes:subtitle><![CDATA[SIGMOD'24]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>18</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1728371137422-65610fb6-8226-4e26-b671-a6c288003e02.jpeg"/>
			<description><![CDATA[<p>In this episode, we sit down with <a href="https://www.linkedin.com/in/marcocosta97/" rel="noopener noreferrer" target="_blank">Marco Costa</a> to discuss the fascinating world of range filters, focusing on how they help optimize queries in databases by determining whether a range intersects with a given set of keys. Marco explains how traditional range filters, often built on Bloom filters, suffer from high false-positive rates and slow query times, especially when dealing with adversarial inputs where queries are correlated with the keys. He walks us through the limitations of existing heuristic-based solutions and the common challenges they face in maintaining accuracy and speed under such conditions.</p><br><p>The highlight of our conversation is Grafite, a novel range filter introduced by Marco and his team. Unlike previous approaches, Grafite comes with clear theoretical guarantees and offers robust performance across various datasets, query sizes, and workloads. Marco dives into the technicalities, explaining how Grafite delivers faster query times and maintains predictable false-positive rates, making it the most reliable range filter in scenarios where queries are correlated with keys. Additionally, he introduces a simple heuristic filter that excels on uncorrelated queries, pushing the boundaries of current solutions in the field.</p><br><p><a href="https://arxiv.org/pdf/2311.15380" rel="noopener noreferrer" target="_blank">SIGMOD'24 Paper - Grafite: Taming Adversarial Queries with Optimal Range Filters</a></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, we sit down with <a href="https://www.linkedin.com/in/marcocosta97/" rel="noopener noreferrer" target="_blank">Marco Costa</a> to discuss the fascinating world of range filters, focusing on how they help optimize queries in databases by determining whether a range intersects with a given set of keys. Marco explains how traditional range filters, often built on Bloom filters, suffer from high false-positive rates and slow query times, especially when dealing with adversarial inputs where queries are correlated with the keys. He walks us through the limitations of existing heuristic-based solutions and the common challenges they face in maintaining accuracy and speed under such conditions.</p><br><p>The highlight of our conversation is Grafite, a novel range filter introduced by Marco and his team. Unlike previous approaches, Grafite comes with clear theoretical guarantees and offers robust performance across various datasets, query sizes, and workloads. Marco dives into the technicalities, explaining how Grafite delivers faster query times and maintains predictable false-positive rates, making it the most reliable range filter in scenarios where queries are correlated with keys. Additionally, he introduces a simple heuristic filter that excels on uncorrelated queries, pushing the boundaries of current solutions in the field.</p><br><p><a href="https://arxiv.org/pdf/2311.15380" rel="noopener noreferrer" target="_blank">SIGMOD'24 Paper - Grafite: Taming Adversarial Queries with Optimal Range Filters</a></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>High Impact in Databases with... Ali Dasdan</title>
			<itunes:title>High Impact in Databases with... Ali Dasdan</itunes:title>
			<pubDate>Tue, 08 Oct 2024 07:05:45 GMT</pubDate>
			<itunes:duration>1:03:02</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/670395d36f369dd0359ff627/media.mp3" length="60522624" type="audio/mpeg"/>
			<guid isPermaLink="false">670395d36f369dd0359ff627</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/high-impact-in-databases-with-ali-dasdan</link>
			<acast:episodeId>670395d36f369dd0359ff627</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>high-impact-in-databases-with-ali-dasdan</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4KIqSXlGljTGSIvLzYhv6yEfePjxqKnF0gFNZG0gfKYxFZ/oMweODh/APoFFdXIb7zVok75Irf7er14+GLhyI3s]]></acast:settings>
			<itunes:subtitle>High Impact</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>7</itunes:season>
			<itunes:episode>7</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1728235678907-e31e07c2-d94e-45f2-88d8-419f0887b6d5.jpeg"/>
			<description><![CDATA[<p>In this High Impact episode we talk to Ali Dasdan, CTO at <a href="https://www.zoominfo.com/" rel="noopener noreferrer" target="_blank">ZoomInfo</a>. Tune in to hear Ali's story and learn about some of his most impactful work, such as "Map-Reduce-Merge".</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a>, the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open source temporal graph analytics engine for Python and Rust.</p><br><p>Materials mentioned on this episode:</p><ul><li><a href="https://citeseerx.ist.psu.edu/document?repid=rep1&amp;type=pdf&amp;doi=e3a2abce2230a9b2f9ff5af3669612103a576d6d" rel="noopener noreferrer" target="_blank">Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters</a> (SIGMOD'07)</li><li><a href="https://worrydream.com/refs/Hamming_1997_-_The_Art_of_Doing_Science_and_Engineering.pdf" rel="noopener noreferrer" target="_blank">The Art of Doing Science and Engineering: Learning to Learn</a>, Richard Hamming</li><li><a href="https://www.amazon.co.uk/How-Solve-Mathematical-Penguin-Science/dp/0140124993" rel="noopener noreferrer" target="_blank">How to Solve It</a>, George Polya</li><li><a href="https://www.amazon.com/Systems-Architecting-Creating-Building-Complex/dp/0138803455" rel="noopener noreferrer" target="_blank">Systems Architecting: Creating &amp; Building Complex Systems</a>, Eberhardt Rechtin</li></ul><p><br></p><p>You can find Ali on:</p><ul><li><a href="https://x.com/alidasdan" rel="noopener noreferrer" target="_blank">Twitter</a></li><li><a href="https://www.linkedin.com/in/dasdan/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. 
See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this High Impact episode we talk to Ali Dasdan, CTO at <a href="https://www.zoominfo.com/" rel="noopener noreferrer" target="_blank">ZoomInfo</a>. Tune in to hear Ali's story and learn about some of his most impactful work, such as "Map-Reduce-Merge".</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a>, the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open source temporal graph analytics engine for Python and Rust.</p><br><p>Materials mentioned on this episode:</p><ul><li><a href="https://citeseerx.ist.psu.edu/document?repid=rep1&amp;type=pdf&amp;doi=e3a2abce2230a9b2f9ff5af3669612103a576d6d" rel="noopener noreferrer" target="_blank">Map-Reduce-Merge: Simplified Relational Data Processing on Large Clusters</a> (SIGMOD'07)</li><li><a href="https://worrydream.com/refs/Hamming_1997_-_The_Art_of_Doing_Science_and_Engineering.pdf" rel="noopener noreferrer" target="_blank">The Art of Doing Science and Engineering: Learning to Learn</a>, Richard Hamming</li><li><a href="https://www.amazon.co.uk/How-Solve-Mathematical-Penguin-Science/dp/0140124993" rel="noopener noreferrer" target="_blank">How to Solve It</a>, George Polya</li><li><a href="https://www.amazon.com/Systems-Architecting-Creating-Building-Complex/dp/0138803455" rel="noopener noreferrer" target="_blank">Systems Architecting: Creating &amp; Building Complex Systems</a>, Eberhardt Rechtin</li></ul><p><br></p><p>You can find Ali on:</p><ul><li><a href="https://x.com/alidasdan" rel="noopener noreferrer" target="_blank">Twitter</a></li><li><a href="https://www.linkedin.com/in/dasdan/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. 
See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Matt Perron | Analytical Workload Cost and Performance Stability With Elastic Pools | #57</title>
			<itunes:title>Matt Perron | Analytical Workload Cost and Performance Stability With Elastic Pools | #57</itunes:title>
			<pubDate>Mon, 22 Jul 2024 06:24:20 GMT</pubDate>
			<itunes:duration>52:10</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/6692ec4e7a7bbfd5f8645bec/media.mp3" length="50086016" type="audio/mpeg"/>
			<guid isPermaLink="false">6692ec4e7a7bbfd5f8645bec</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://matthew-perron.com/assets/pdf/cackle.pdf</link>
			<acast:episodeId>6692ec4e7a7bbfd5f8645bec</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>matt-perron-cackle</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4KK0wEpgkA4cTfrUjom6vHqGx+/NFUzxphpCdPvvZ7d8o+yNEDGleXQdZy7lxKtPTXMCWkj0jr39k3Sd9fk8apO]]></acast:settings>
			<itunes:subtitle>Cutting Edge</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>17</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1720904753499-030b740bc60f1deab0ed6fb47b09eba5.jpeg"/>
			<description><![CDATA[<p>In this episode, we dive deep into the complexities of managing analytical query workloads with our guest, Matt Perron. Matt explains how the rapid and unpredictable fluctuations in resource demands present a significant challenge for provisioning. Traditional methods often lead to either over-provisioning, resulting in excessive costs, or under-provisioning, which causes poor query latency during demand spikes. However, there's a promising solution on the horizon. Matt shares insights from recent research that showcases the viability of using cloud functions to dynamically match compute supply with workload demand without the need for prior resource provisioning. While effective for low query volumes, this approach becomes cost-prohibitive as query volumes increase, highlighting the need for a more balanced strategy.</p><br><p>Matt introduces us to a novel strategy that combines the best of both worlds: the rapid scalability of cloud functions and the cost-effectiveness of virtual machines. This innovative approach leverages the fast but expensive cloud functions alongside slow-starting yet inexpensive virtual machines to provide elasticity without sacrificing cost efficiency. He elaborates on how their implementation, called Cackle, achieves consistent performance and cost savings across a wide range of workloads and conditions. Tune in to learn how Cackle avoids the pitfalls of traditional approaches, delivering stable query performance and minimizing costs even as demand fluctuates wildly.</p><br><p>Links:</p><ul><li><a href="https://matthew-perron.com/assets/pdf/cackle.pdf" rel="noopener noreferrer" target="_blank">Cackle: Analytical Workload Cost and Performance Stability With Elastic Pools [SIGMOD'24]</a></li><li><a href="https://matthew-perron.com/" rel="noopener noreferrer" target="_blank">Matt's Homepage</a></li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. 
See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, we dive deep into the complexities of managing analytical query workloads with our guest, Matt Perron. Matt explains how the rapid and unpredictable fluctuations in resource demands present a significant challenge for provisioning. Traditional methods often lead to either over-provisioning, resulting in excessive costs, or under-provisioning, which causes poor query latency during demand spikes. However, there's a promising solution on the horizon. Matt shares insights from recent research that showcases the viability of using cloud functions to dynamically match compute supply with workload demand without the need for prior resource provisioning. While effective for low query volumes, this approach becomes cost-prohibitive as query volumes increase, highlighting the need for a more balanced strategy.</p><br><p>Matt introduces us to a novel strategy that combines the best of both worlds: the rapid scalability of cloud functions and the cost-effectiveness of virtual machines. This innovative approach leverages the fast but expensive cloud functions alongside slow-starting yet inexpensive virtual machines to provide elasticity without sacrificing cost efficiency. He elaborates on how their implementation, called Cackle, achieves consistent performance and cost savings across a wide range of workloads and conditions. Tune in to learn how Cackle avoids the pitfalls of traditional approaches, delivering stable query performance and minimizing costs even as demand fluctuates wildly.</p><br><p>Links:</p><ul><li><a href="https://matthew-perron.com/assets/pdf/cackle.pdf" rel="noopener noreferrer" target="_blank">Cackle: Analytical Workload Cost and Performance Stability With Elastic Pools [SIGMOD'24]</a></li><li><a href="https://matthew-perron.com/" rel="noopener noreferrer" target="_blank">Matt's Homepage</a></li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. 
See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>High Impact in Databases with... Andreas Kipf</title>
			<itunes:title>High Impact in Databases with... Andreas Kipf</itunes:title>
			<pubDate>Mon, 15 Jul 2024 07:30:07 GMT</pubDate>
			<itunes:duration>53:06</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/669273947d0f33cb7412bade/media.mp3" length="50985088" type="audio/mpeg"/>
			<guid isPermaLink="false">669273947d0f33cb7412bade</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.utn.de/en/departments/department-engineering/data-systems-lab/</link>
			<acast:episodeId>669273947d0f33cb7412bade</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>high-impact-in-databases-with-andreas-kipf</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4KdHffykOTeBO7TQx2b4L9eA5mQZiXr4LfVw/qnL7TIw2xDJLtcMvtbLwHmeNXpQBA6tckcb4IHoGwIYUDNI5Ti]]></acast:settings>
			<itunes:subtitle>High Impact</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>7</itunes:season>
			<itunes:episode>6</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1720873177220-c29722b32316650baa21423c53a281ba.jpeg"/>
			<description><![CDATA[<p>In this High Impact episode we talk to <a href="https://www.utn.de/en/departments/department-engineering/data-systems-lab/" rel="noopener noreferrer" target="_blank">Andreas Kipf</a> about his work on "Learned Cardinalities". </p><br><p>Andreas is Professor of Data Systems at Technische Universität Nürnberg (UTN). Tune in to hear Andreas's story and learn about some of his most impactful work.</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a>, the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open source temporal graph analytics engine for Python and Rust.</p><br><p>Papers mentioned on this episode:</p><ul><li><a href="https://arxiv.org/abs/1809.00677" rel="noopener noreferrer" target="_blank">Learned Cardinalities: Estimating Correlated Joins with Deep Learning</a> (CIDR'19)</li><li><a href="https://www.cl.cam.ac.uk/~ey204/teaching/ACS/R244_2018_2019/papers/Kraska_SIGMOD_2018.pdf" rel="noopener noreferrer" target="_blank">The Case for Learned Index Structures</a> (SIGMOD'18)</li><li><a href="https://db.in.tum.de/~radke/papers/hugejoins.pdf" rel="noopener noreferrer" target="_blank">Adaptive Optimization of Very Large Join Queries</a> (SIGMOD'18)</li></ul><p><br></p><p>You can find Andreas on:</p><ul><li><a href="https://x.com/andreaskipf" rel="noopener noreferrer" target="_blank">Twitter</a></li><li><a href="https://www.linkedin.com/in/andreaskipf/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li><li><a href="https://scholar.google.com/citations?user=y3IdRusAAAAJ&amp;hl=en" rel="noopener noreferrer" target="_blank">Google Scholar</a></li><li><a href="https://www.utn.de/en/departments/department-engineering/data-systems-lab/" rel="noopener noreferrer" target="_blank">Data Systems Lab @ UTN</a></li></ul><hr><p style='color:grey; 
font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this High Impact episode we talk to <a href="https://www.utn.de/en/departments/department-engineering/data-systems-lab/" rel="noopener noreferrer" target="_blank">Andreas Kipf</a> about his work on "Learned Cardinalities". </p><br><p>Andreas is Professor of Data Systems at Technische Universität Nürnberg (UTN). Tune in to hear Andreas's story and learn about some of his most impactful work.</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a>, the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open source temporal graph analytics engine for Python and Rust.</p><br><p>Papers mentioned on this episode:</p><ul><li><a href="https://arxiv.org/abs/1809.00677" rel="noopener noreferrer" target="_blank">Learned Cardinalities: Estimating Correlated Joins with Deep Learning</a> (CIDR'19)</li><li><a href="https://www.cl.cam.ac.uk/~ey204/teaching/ACS/R244_2018_2019/papers/Kraska_SIGMOD_2018.pdf" rel="noopener noreferrer" target="_blank">The Case for Learned Index Structures</a> (SIGMOD'18)</li><li><a href="https://db.in.tum.de/~radke/papers/hugejoins.pdf" rel="noopener noreferrer" target="_blank">Adaptive Optimization of Very Large Join Queries</a> (SIGMOD'18)</li></ul><p><br></p><p>You can find Andreas on:</p><ul><li><a href="https://x.com/andreaskipf" rel="noopener noreferrer" target="_blank">Twitter</a></li><li><a href="https://www.linkedin.com/in/andreaskipf/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li><li><a href="https://scholar.google.com/citations?user=y3IdRusAAAAJ&amp;hl=en" rel="noopener noreferrer" target="_blank">Google Scholar</a></li><li><a href="https://www.utn.de/en/departments/department-engineering/data-systems-lab/" rel="noopener noreferrer" target="_blank">Data Systems Lab @ UTN</a></li></ul><hr><p style='color:grey; 
font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title><![CDATA[Marvin Wyrich & Justus Bogner | How Software Engineering Research Is Discussed on LinkedIn | #56]]></title>
			<itunes:title><![CDATA[Marvin Wyrich & Justus Bogner | How Software Engineering Research Is Discussed on LinkedIn | #56]]></itunes:title>
			<pubDate>Mon, 08 Jul 2024 07:13:17 GMT</pubDate>
			<itunes:duration>47:53</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/668ac8b319ce290b0bdb27a3/media.mp3" length="45973632" type="audio/mpeg"/>
			<guid isPermaLink="false">668ac8b319ce290b0bdb27a3</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://arxiv.org/pdf/2401.02268</link>
			<acast:episodeId>668ac8b319ce290b0bdb27a3</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>marvin-wyrich-justus-bogner</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4IZLPGrN+ZLotJY9f0a0DeLJHFUCujE/YWpQweg2GIVFFvqQR1rQdS43eRQh0Vc06Zf+bLmCXE52RYPfDO00FR4]]></acast:settings>
			<itunes:subtitle>Cutting Edge</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>16</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1720371142635-f9c202231adaaa581971a82ce645fdfe.jpeg"/>
			<description><![CDATA[<p>In this episode, we delve into the intersection of software engineering (SE) research and professional practice with experts Marvin Wyrich and Justus Bogner. As LinkedIn stands as the largest professional network globally, it serves as a critical platform for bridging the gap between SE researchers and practitioners. Marvin and Justus explore the dynamics of how research findings are shared and discussed on LinkedIn, providing both quantitative and qualitative insights into the effectiveness of these interactions. They reveal that a significant portion of SE research posts on LinkedIn are authored by individuals outside the original research team and that a majority of comments on these posts come from industry professionals, highlighting a vibrant but underutilized avenue for science communication.</p><br><p>Our guests shed light on the current state of this metaphorical bridge, emphasizing the potential for LinkedIn to enhance collaboration and knowledge exchange between academia and industry. Despite the promising engagement from practitioners, the discussion reveals that only half of the SE research posts receive any comments, indicating room for improvement in fostering more interactive dialogues. Marvin and Justus offer practical advice for researchers to better engage with practitioners on LinkedIn and suggest strategies for making research dissemination more impactful. 
This episode provides valuable insights for anyone interested in leveraging social media for advancing software engineering knowledge and practice.</p><p><br></p><h3>Links:</h3><ul><li><a href="https://arxiv.org/pdf/2401.02268" rel="noopener noreferrer" target="_blank">ICSE'24 Paper</a></li><li><a href="https://marvin-wyrich.de/" rel="noopener noreferrer" target="_blank">Marvin's Homepage</a></li><li><a href="https://xjreb.github.io/" rel="noopener noreferrer" target="_blank">Justus's Homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, we delve into the intersection of software engineering (SE) research and professional practice with experts Marvin Wyrich and Justus Bogner. As LinkedIn stands as the largest professional network globally, it serves as a critical platform for bridging the gap between SE researchers and practitioners. Marvin and Justus explore the dynamics of how research findings are shared and discussed on LinkedIn, providing both quantitative and qualitative insights into the effectiveness of these interactions. They reveal that a significant portion of SE research posts on LinkedIn are authored by individuals outside the original research team and that a majority of comments on these posts come from industry professionals, highlighting a vibrant but underutilized avenue for science communication.</p><br><p>Our guests shed light on the current state of this metaphorical bridge, emphasizing the potential for LinkedIn to enhance collaboration and knowledge exchange between academia and industry. Despite the promising engagement from practitioners, the discussion reveals that only half of the SE research posts receive any comments, indicating room for improvement in fostering more interactive dialogues. Marvin and Justus offer practical advice for researchers to better engage with practitioners on LinkedIn and suggest strategies for making research dissemination more impactful. 
This episode provides valuable insights for anyone interested in leveraging social media for advancing software engineering knowledge and practice.</p><p><br></p><h3>Links:</h3><ul><li><a href="https://arxiv.org/pdf/2401.02268" rel="noopener noreferrer" target="_blank">ICSE'24 Paper</a></li><li><a href="https://marvin-wyrich.de/" rel="noopener noreferrer" target="_blank">Marvin's Homepage</a></li><li><a href="https://xjreb.github.io/" rel="noopener noreferrer" target="_blank">Justus's Homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>High Impact in Databases with... Joe Hellerstein</title>
			<itunes:title>High Impact in Databases with... Joe Hellerstein</itunes:title>
			<pubDate>Mon, 01 Jul 2024 06:25:22 GMT</pubDate>
			<itunes:duration>52:56</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/667fba685cd64e8b86607c40/media.mp3" length="50827392" type="audio/mpeg"/>
			<guid isPermaLink="false">667fba685cd64e8b86607c40</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://dsf.berkeley.edu/jmh/bio.html</link>
			<acast:episodeId>667fba685cd64e8b86607c40</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>high-impact-in-databases-with-joe-hellerstein</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4JYIYgLTmVw7tHDxOlx9V93zgO0ulcufhEQV21jxObVikAOq3cBWaXQR0E28z4WTQ/s7MkDzm/j3Jg4XL4mGBf2]]></acast:settings>
			<itunes:subtitle>High Impact</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>7</itunes:season>
			<itunes:episode>5</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1719646698927-8831c56d1afbba75f5bc24730eec8c58.jpeg"/>
			<description><![CDATA[<p>In this High Impact episode we talk to <a href="https://dsf.berkeley.edu/jmh/bio.html" rel="noopener noreferrer" target="_blank">Joe Hellerstein</a>.</p><br><p>Joe is the Jim Gray Professor of Computer Science at UC Berkeley. Tune in to hear Joe's story and learn about some of his most impactful work.</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a>, the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open source temporal graph analytics engine for Python and Rust.</p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this High Impact episode we talk to <a href="https://dsf.berkeley.edu/jmh/bio.html" rel="noopener noreferrer" target="_blank">Joe Hellerstein</a>.</p><br><p>Joe is the Jim Gray Professor of Computer Science at UC Berkeley. Tune in to hear Joe's story and learn about some of his most impactful work.</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a>, the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open source temporal graph analytics engine for Python and Rust.</p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Harry Goldstein | Property-Based Testing | #55</title>
			<itunes:title>Harry Goldstein | Property-Based Testing | #55</itunes:title>
			<pubDate>Tue, 25 Jun 2024 16:37:35 GMT</pubDate>
			<itunes:duration>49:13</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/66785daf4297480012eaeb8e/media.mp3" length="47259776" type="audio/mpeg"/>
			<guid isPermaLink="false">66785daf4297480012eaeb8e</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://harrisongoldste.in/papers/icse24-pbt-in-practice.pdf</link>
			<acast:episodeId>66785daf4297480012eaeb8e</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>harry-goldstein-property-based-testing-55</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4KeJR/lKo4PtXxM0/QrAshvjrZctMTslCKISqWhll9yYMwkpITbujDS0Udvlhi6aXeZHqceH+qZGkh2OEEDmA7d]]></acast:settings>
			<itunes:subtitle>Cutting Edge</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>15</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1719164296949-b939b1b72590c48a268c8adff09dbe4f.jpeg"/>
			<description><![CDATA[<p>In this episode, we chat with Harry Goldstein about Property-Based Testing (PBT). Harry shares insights from interviews with PBT users at Jane Street, highlighting PBT's strengths in testing complex code and boosting developer confidence. Harry also discusses the challenges of writing properties and generating random data, and the difficulties in assessing test effectiveness. He identifies key areas for future improvement, such as performance enhancements and better random input generation. This episode is essential for those interested in the latest developments in software testing and PBT's future.</p><p><br></p><h3>Links:</h3><ul><li><a href="https://harrisongoldste.in/papers/icse24-pbt-in-practice.pdf" rel="noopener noreferrer" target="_blank">ICSE'24 Paper </a></li><li><a href="https://harrisongoldste.in/" rel="noopener noreferrer" target="_blank">Harry's website</a></li><li>X: <a href="https://twitter.com/hgoldstein95" rel="noopener noreferrer" target="_blank">@hgoldstein95</a> </li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, we chat with Harry Goldstein about Property-Based Testing (PBT). Harry shares insights from interviews with PBT users at Jane Street, highlighting PBT's strengths in testing complex code and boosting developer confidence. Harry also discusses the challenges of writing properties and generating random data, and the difficulties in assessing test effectiveness. He identifies key areas for future improvement, such as performance enhancements and better random input generation. This episode is essential for those interested in the latest developments in software testing and PBT's future.</p><p><br></p><h3>Links:</h3><ul><li><a href="https://harrisongoldste.in/papers/icse24-pbt-in-practice.pdf" rel="noopener noreferrer" target="_blank">ICSE'24 Paper </a></li><li><a href="https://harrisongoldste.in/" rel="noopener noreferrer" target="_blank">Harry's website</a></li><li>X: <a href="https://twitter.com/hgoldstein95" rel="noopener noreferrer" target="_blank">@hgoldstein95</a> </li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>High Impact in Databases with... Raghu Ramakrishnan</title>
			<itunes:title>High Impact in Databases with... Raghu Ramakrishnan</itunes:title>
			<pubDate>Mon, 17 Jun 2024 07:15:18 GMT</pubDate>
			<itunes:duration>23:56</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/6637aafeb7ee620013d65a2f/media.mp3" length="22982784" type="audio/mpeg"/>
			<guid isPermaLink="false">6637aafeb7ee620013d65a2f</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/raghu-ramakrishnan</link>
			<acast:episodeId>6637aafeb7ee620013d65a2f</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>raghu-ramakrishnan</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4J8PYZ1B166eWj376dCXhsLdu1rBoQik8GrQUAozWn7/JfCKiusMru4pHfQS9ocK8kDlq7/Ix1uuJ4zjoKAILdj]]></acast:settings>
			<itunes:subtitle>High Impact</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>7</itunes:season>
			<itunes:episode>4</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1714924153888-4cc1328eb87fe91a0c690ebf239a0da6.jpeg"/>
			<description><![CDATA[<p>In this High Impact episode we talk to <a href="https://www.microsoft.com/en-us/research/people/raghu/" rel="noopener noreferrer" target="_blank">Raghu Ramakrishnan</a>.</p><br><p>Raghu is CTO for Data and a Technical Fellow at Microsoft.&nbsp;Tune in to hear Raghu's story and learn about some of his most impactful work.</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a>, the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open source temporal graph analytics engine for Python and Rust.</p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this High Impact episode we talk to <a href="https://www.microsoft.com/en-us/research/people/raghu/" rel="noopener noreferrer" target="_blank">Raghu Ramakrishnan</a>.</p><br><p>Raghu is CTO for Data and a Technical Fellow at Microsoft.&nbsp;Tune in to hear Raghu's story and learn about some of his most impactful work.</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a>, the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open source temporal graph analytics engine for Python and Rust.</p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Gina Yuan | In-Network Assistance With Sidekick Protocols | #54</title>
			<itunes:title>Gina Yuan | In-Network Assistance With Sidekick Protocols | #54</itunes:title>
			<pubDate>Mon, 10 Jun 2024 06:30:00 GMT</pubDate>
			<itunes:duration>55:25</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/665f4ce4e177f10012438a7d/media.mp3" length="53206841" type="audio/mpeg"/>
			<guid isPermaLink="false">665f4ce4e177f10012438a7d</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/gina-yuan-sidekick-protocols-54</link>
			<acast:episodeId>665f4ce4e177f10012438a7d</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>gina-yuan-sidekick-protocols-54</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4ImRTGP0p147y6b2REAnyqIaLwEydKsvWuDoqV2NDeoqFSSxPC2d2PD5kmsxFQ1V+YafNcCbcz6G/uxzSnWVOoh]]></acast:settings>
			<itunes:subtitle>Cutting Edge</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>14</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1717868296545-3c26fd83d5248d274fe9a1e9693e28ae.jpeg"/>
			<description><![CDATA[<p>Join us as we chat with Gina Yuan about her pioneering work on sidekick protocols, designed to enhance the performance of encrypted transport protocols like QUIC and WebRTC. These protocols ensure privacy but limit in-network innovations. Gina explains how sidekick protocols allow intermediaries to assist endpoints without compromising encryption.</p><br><p>Discover how Gina tackles the challenge of referencing opaque packets with her innovative quACK tool and learn about the real-world benefits, including improved Wi-Fi retransmissions, energy-saving proxy acknowledgments, and the PACUBIC congestion-control mechanism. This episode offers a glimpse into the future of network performance and security.</p><p><br></p><h3>Links:</h3><ul><li><a href="https://ginayuan.com/papers/nsdi24-sidekick.pdf" rel="noopener noreferrer" target="_blank">NSDI'2024 Paper</a></li><li><a href="https://ginayuan.com/" rel="noopener noreferrer" target="_blank">Gina's Homepage</a></li><li><a href="https://github.com/ygina/sidekick" rel="noopener noreferrer" target="_blank">Sidekick's Github Repo</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>Join us as we chat with Gina Yuan about her pioneering work on sidekick protocols, designed to enhance the performance of encrypted transport protocols like QUIC and WebRTC. These protocols ensure privacy but limit in-network innovations. Gina explains how sidekick protocols allow intermediaries to assist endpoints without compromising encryption.</p><br><p>Discover how Gina tackles the challenge of referencing opaque packets with her innovative quACK tool and learn about the real-world benefits, including improved Wi-Fi retransmissions, energy-saving proxy acknowledgments, and the PACUBIC congestion-control mechanism. This episode offers a glimpse into the future of network performance and security.</p><p><br></p><h3>Links:</h3><ul><li><a href="https://ginayuan.com/papers/nsdi24-sidekick.pdf" rel="noopener noreferrer" target="_blank">NSDI'2024 Paper</a></li><li><a href="https://ginayuan.com/" rel="noopener noreferrer" target="_blank">Gina's Homepage</a></li><li><a href="https://github.com/ygina/sidekick" rel="noopener noreferrer" target="_blank">Sidekick's Github Repo</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>High Impact in Databases with... Moshe Vardi</title>
			<itunes:title>High Impact in Databases with... Moshe Vardi</itunes:title>
			<pubDate>Mon, 03 Jun 2024 06:30:03 GMT</pubDate>
			<itunes:duration>47:39</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/665721d20abe7000128931e3/media.mp3" length="45755046" type="audio/mpeg"/>
			<guid isPermaLink="false">665721d20abe7000128931e3</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/high-impact-in-databases-with-moshe-vardi</link>
			<acast:episodeId>665721d20abe7000128931e3</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>high-impact-in-databases-with-moshe-vardi</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4LOyW3foTM1q9Fk3kJ2EJnGzD8syskFOb7LAkFpHjSzlqgQJ0fDlKcUdFKPBZXST7Tg2ymq767bw4RHElZrTrt3]]></acast:settings>
			<itunes:subtitle>High Impact</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>7</itunes:season>
			<itunes:episode>3</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1716986312344-d38cd5f907b5567b666f022f8c2703d4.jpeg"/>
			<description><![CDATA[<p>Welcome to another episode of the High Impact series - today we talk with Moshe Vardi! </p><br><p>Moshe is the Karen George Distinguished Service Professor in Computational Engineering at Rice University where his research focuses on automated reasoning. Tune in to hear Moshe's story and learn about some of his most impactful work.</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a>, the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open source temporal graph analytics engine for Python and Rust.</p><br><p>You can find Moshe on X, LinkedIn, and Mastodon @vardi. Links to all his work can be found on his website <a href="https://www.cs.rice.edu/~vardi/" rel="noopener noreferrer" target="_blank">here</a>. </p><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>Welcome to another episode of the High Impact series - today we talk with Moshe Vardi! </p><br><p>Moshe is the Karen George Distinguished Service Professor in Computational Engineering at Rice University where his research focuses on automated reasoning. Tune in to hear Moshe's story and learn about some of his most impactful work.</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a>, the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open source temporal graph analytics engine for Python and Rust.</p><br><p>You can find Moshe on X, LinkedIn, and Mastodon @vardi. Links to all his work can be found on his website <a href="https://www.cs.rice.edu/~vardi/" rel="noopener noreferrer" target="_blank">here</a>. </p><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Tammy Sukprasert | Move Your Workloads To Sweden! | #53</title>
			<itunes:title>Tammy Sukprasert | Move Your Workloads To Sweden! | #53</itunes:title>
			<pubDate>Mon, 27 May 2024 06:20:14 GMT</pubDate>
			<itunes:duration>32:50</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/66506576f749480012b943da/media.mp3" length="31531136" type="audio/mpeg"/>
			<guid isPermaLink="false">66506576f749480012b943da</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/tammy-sukprasert-move-your-workloads-to-sweden</link>
			<acast:episodeId>66506576f749480012b943da</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>tammy-sukprasert-move-your-workloads-to-sweden</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4L3TFd2OjyF7tz9wyNOb5mTvvMRxeGe7VhI5paIoONFofXHCrgSlOEWfQtOsAXXQJLRgyO54BXHPqtOOAybYmhW]]></acast:settings>
			<itunes:subtitle>Cutting Edge</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>13</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1716544315565-5798e7fc07a56a2231a3dd1d9601351c.jpeg"/>
			<description><![CDATA[<p>In this episode, we dip our toes into the world of sustainable computing and interview Tammy Sukprasert about her research on reducing carbon emissions in cloud computing through workload scheduling. Tammy explores the concept of shifting cloud workloads across different times and locations to coincide with low-carbon energy availability. Unlike previous studies that focused on specific regions or workloads, her comprehensive analysis uses carbon intensity data from 123 regions to assess both batch and interactive workloads. She considers various factors such as job duration, deadlines, and service level objectives (SLOs). Tammy's findings reveal that while spatiotemporal workload shifting can reduce carbon emissions, the practical upper bounds of these reductions are limited and far from ideal. Simple scheduling policies often achieve most of the potential reductions, with more complex techniques offering minimal additional benefits.</p><br><p>Additionally, Tammy's research highlights that as the energy grid becomes greener, the benefits of carbon-aware scheduling over carbon-agnostic approaches decrease. This discussion offers crucial insights for the future of cloud computing and sustainable technology. Whether you're a tech enthusiast, environmental advocate, or cloud industry professional, Tammy's work provides valuable perspectives on the intersection of technology and sustainability. 
Join us to learn more about how innovative scheduling strategies can contribute to a greener cloud computing landscape.</p><p><br></p><h3>Links:</h3><ul><li><a href="https://www.linkedin.com/in/tsukprasert/" rel="noopener noreferrer" target="_blank">Tammy's LinkedIn</a></li><li><a href="https://arxiv.org/pdf/2306.06502" rel="noopener noreferrer" target="_blank">On the Limitations of Carbon-Aware Temporal and Spatial Workload Shifting in the Cloud</a> EuroSys'24 Paper </li><li><a href="https://github.com/umassos/decarbonization-potential" rel="noopener noreferrer" target="_blank">Carbon Savings Upper Bound Analysis</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, we dip our toes into the world of sustainable computing and interview Tammy Sukprasert about her research on reducing carbon emissions in cloud computing through workload scheduling. Tammy explores the concept of shifting cloud workloads across different times and locations to coincide with low-carbon energy availability. Unlike previous studies that focused on specific regions or workloads, her comprehensive analysis uses carbon intensity data from 123 regions to assess both batch and interactive workloads. She considers various factors such as job duration, deadlines, and service level objectives (SLOs). Tammy's findings reveal that while spatiotemporal workload shifting can reduce carbon emissions, the practical upper bounds of these reductions are limited and far from ideal. Simple scheduling policies often achieve most of the potential reductions, with more complex techniques offering minimal additional benefits.</p><br><p>Additionally, Tammy's research highlights that as the energy grid becomes greener, the benefits of carbon-aware scheduling over carbon-agnostic approaches decrease. This discussion offers crucial insights for the future of cloud computing and sustainable technology. Whether you're a tech enthusiast, environmental advocate, or cloud industry professional, Tammy's work provides valuable perspectives on the intersection of technology and sustainability. 
Join us to learn more about how innovative scheduling strategies can contribute to a greener cloud computing landscape.</p><p><br></p><h3>Links:</h3><ul><li><a href="https://www.linkedin.com/in/tsukprasert/" rel="noopener noreferrer" target="_blank">Tammy's LinkedIn</a></li><li><a href="https://arxiv.org/pdf/2306.06502" rel="noopener noreferrer" target="_blank">On the Limitations of Carbon-Aware Temporal and Spatial Workload Shifting in the Cloud</a> EuroSys'24 Paper </li><li><a href="https://github.com/umassos/decarbonization-potential" rel="noopener noreferrer" target="_blank">Carbon Savings Upper Bound Analysis</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>High Impact in Databases with... Ryan Marcus</title>
			<itunes:title>High Impact in Databases with... Ryan Marcus</itunes:title>
			<pubDate>Mon, 20 May 2024 06:30:08 GMT</pubDate>
			<itunes:duration>59:52</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/664a58afd4e90300122472b2/media.mp3" length="57475200" type="audio/mpeg"/>
			<guid isPermaLink="false">664a58afd4e90300122472b2</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/ryan-marcus</link>
			<acast:episodeId>664a58afd4e90300122472b2</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>ryan-marcus</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4LE+ltUwWptPnautQ45x9bQq8UzlujIYPFIFAR+zLTDyDBCACfgUlXjymgisEpEi85aK7561sa80UeS/7/ma3ye]]></acast:settings>
			<itunes:subtitle>High Impact</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>7</itunes:season>
			<itunes:episode>2</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1716147671150-02b55a04c131b15abfd22ed65634e061.jpeg"/>
			<description><![CDATA[<p>Welcome to the first episode of the High Impact series!</p><br><p>The High Impact series is inspired by a blog post “<a href="https://rmarcus.info/blog/2023/07/25/papers.html" rel="noopener noreferrer" target="_blank">Most Influential Database Papers</a>" by <a href="https://rmarcus.info/blog/" rel="noopener noreferrer" target="_blank">Ryan Marcus</a> and today we talk to Ryan! Tune in to hear about Ryan's story so far. We chat about his current work before moving on to discuss his most impactful work. We also dig into what motivates him and how he handles setbacks, as well as getting his take on the current trends.</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a> the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open source temporal graph analytics engine for Python and Rust.</p><br><p>Links:</p><ul><li><a href="https://rmarcus.info/blog/2023/07/25/papers.html" rel="noopener noreferrer" target="_blank">Most influential database papers</a></li><li><a href="https://rmarcus.info/blog/" rel="noopener noreferrer" target="_blank">Ryan's website</a></li><li><a href="https://x.com/RyanMarcus" rel="noopener noreferrer" target="_blank">Ryan's twitter/X</a></li><li><a href="https://files.zotero.net/eyJleHBpcmVzIjoxNzE2MTQ3OTkwLCJoYXNoIjoiNzViNGIyMTgwZDlmZThiODc2NjdmMWI5NTY4ZjgyOWYiLCJjb250ZW50VHlwZSI6ImFwcGxpY2F0aW9uXC9wZGYiLCJjaGFyc2V0IjoiIiwiZmlsZW5hbWUiOiJSeWFuIE1hcmN1cyBldCBhbC4gLSAyMDIyIC0gQmFvIE1ha2luZyBMZWFybmVkIFF1ZXJ5IE9wdGltaXphdGlvbiBQcmFjdGljYWwucGRmIn0%3D/37c008a36bd5da4acb097f72487ffc62f1924ee4e3eebc5f6061cac9a44ebf71/Ryan%20Marcus%20et%20al.%20-%202022%20-%20Bao%20Making%20Learned%20Query%20Optimization%20Practical.pdf" rel="noopener noreferrer" target="_blank">Bao: Making Learned Query Optimization Practical</a></li><li><a 
href="https://files.zotero.net/eyJleHBpcmVzIjoxNzE2MTQ4MDE1LCJoYXNoIjoiYWE3ZmRjMzQ2MjMxNjhhOWRjYWE4OTc1ZjkzMTY5NDkiLCJjb250ZW50VHlwZSI6ImFwcGxpY2F0aW9uXC9wZGYiLCJjaGFyc2V0IjoiIiwiZmlsZW5hbWUiOiJNYXJjdXMgZXQgYWwuIC0gMjAxOSAtIE5lbyBBIExlYXJuZWQgUXVlcnkgT3B0aW1pemVyLnBkZiJ9/2e69c23ae198e5e5271531f7a523c2128ed3f3e2db946bd70d1a0bd96d10f9d2/Marcus%20et%20al.%20-%202019%20-%20Neo%20A%20Learned%20Query%20Optimizer.pdf" rel="noopener noreferrer" target="_blank">Neo: A Learned Query Optimizer</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>Welcome to the first episode of the High Impact series!</p><br><p>The High Impact series is inspired by a blog post “<a href="https://rmarcus.info/blog/2023/07/25/papers.html" rel="noopener noreferrer" target="_blank">Most Influential Database Papers</a>" by <a href="https://rmarcus.info/blog/" rel="noopener noreferrer" target="_blank">Ryan Marcus</a> and today we talk to Ryan! Tune in to hear about Ryan's story so far. We chat about his current work before moving on to discuss his most impactful work. We also dig into what motivates him and how he handles setbacks, as well as getting his take on the current trends.</p><br><p>The podcast is proudly sponsored by <a href="https://www.pometry.com/" rel="noopener noreferrer" target="_blank">Pometry</a> the developers behind <a href="https://www.raphtory.com/" rel="noopener noreferrer" target="_blank">Raphtory</a>, the open source temporal graph analytics engine for Python and Rust.</p><br><p>Links:</p><ul><li><a href="https://rmarcus.info/blog/2023/07/25/papers.html" rel="noopener noreferrer" target="_blank">Most influential database papers</a></li><li><a href="https://rmarcus.info/blog/" rel="noopener noreferrer" target="_blank">Ryan's website</a></li><li><a href="https://x.com/RyanMarcus" rel="noopener noreferrer" target="_blank">Ryan's twitter/X</a></li><li><a href="https://files.zotero.net/eyJleHBpcmVzIjoxNzE2MTQ3OTkwLCJoYXNoIjoiNzViNGIyMTgwZDlmZThiODc2NjdmMWI5NTY4ZjgyOWYiLCJjb250ZW50VHlwZSI6ImFwcGxpY2F0aW9uXC9wZGYiLCJjaGFyc2V0IjoiIiwiZmlsZW5hbWUiOiJSeWFuIE1hcmN1cyBldCBhbC4gLSAyMDIyIC0gQmFvIE1ha2luZyBMZWFybmVkIFF1ZXJ5IE9wdGltaXphdGlvbiBQcmFjdGljYWwucGRmIn0%3D/37c008a36bd5da4acb097f72487ffc62f1924ee4e3eebc5f6061cac9a44ebf71/Ryan%20Marcus%20et%20al.%20-%202022%20-%20Bao%20Making%20Learned%20Query%20Optimization%20Practical.pdf" rel="noopener noreferrer" target="_blank">Bao: Making Learned Query Optimization Practical</a></li><li><a 
href="https://files.zotero.net/eyJleHBpcmVzIjoxNzE2MTQ4MDE1LCJoYXNoIjoiYWE3ZmRjMzQ2MjMxNjhhOWRjYWE4OTc1ZjkzMTY5NDkiLCJjb250ZW50VHlwZSI6ImFwcGxpY2F0aW9uXC9wZGYiLCJjaGFyc2V0IjoiIiwiZmlsZW5hbWUiOiJNYXJjdXMgZXQgYWwuIC0gMjAxOSAtIE5lbyBBIExlYXJuZWQgUXVlcnkgT3B0aW1pemVyLnBkZiJ9/2e69c23ae198e5e5271531f7a523c2128ed3f3e2db946bd70d1a0bd96d10f9d2/Marcus%20et%20al.%20-%202019%20-%20Neo%20A%20Learned%20Query%20Optimizer.pdf" rel="noopener noreferrer" target="_blank">Neo: A Learned Query Optimizer</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Yazhuo Zhang | SIEVE is Simpler than LRU | #52</title>
			<itunes:title>Yazhuo Zhang | SIEVE is Simpler than LRU | #52</itunes:title>
			<pubDate>Mon, 13 May 2024 06:25:04 GMT</pubDate>
			<itunes:duration>43:10</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/663bc295ead6590013841c6d/media.mp3" length="41441408" type="audio/mpeg"/>
			<guid isPermaLink="false">663bc295ead6590013841c6d</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/yazhuo-zhang-sieve</link>
			<acast:episodeId>663bc295ead6590013841c6d</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>yazhuo-zhang-sieve</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4IeRBnv/UEvyKYjXFPJKN2ogn5NrYXwb5TF6FMTAlmFxNvKi3n6hIP2gQZi50eb8OFAPx9WGFFEJZdyB3C+C6mZ]]></acast:settings>
			<itunes:subtitle>Cutting Edge</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>12</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1715191872258-7495d223a93a2f3c0aebcf25e1138ddf.jpeg"/>
			<description><![CDATA[<p>In this episode, we explore the world of caching with Yazhuo Zhang, who introduces the game-changing SIEVE algorithm. Traditional eviction algorithms have long struggled with a trade-off between efficiency, throughput, and simplicity. However, SIEVE disrupts this balance by offering a simpler alternative to LRU while outperforming state-of-the-art algorithms in both efficiency and scalability for web cache workloads. Implemented in five production cache libraries with minimal code changes, SIEVE's superiority shines through in a comprehensive evaluation across 1559 cache traces. With up to a remarkable 63.2% lower miss ratio than ARC and surpassing nine other algorithms in over 45% of cases, SIEVE's simplicity doesn't compromise on scalability, doubling throughput compared to optimized LRU implementations. Join us as Yazhuo reveals how SIEVE is set to redefine caching efficiency, promising faster and more streamlined data serving in production systems.</p><p><br></p><h3>Links:</h3><ul><li><a href="https://yazhuozhang.com/assets/publication/nsdi24-sieve.pdf" rel="noopener noreferrer" target="_blank">SIEVE is Simpler than LRU: an Efficient Turn-Key Eviction Algorithm for Web Caches</a> (NSDI'24)</li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3600006.3613147" rel="noopener noreferrer" target="_blank">FIFO Queues are All You Need for Cache Eviction</a> (SOSP'23)</li><li><a href="https://yazhuozhang.com/" rel="noopener noreferrer" target="_blank">Yazhuo's homepage</a></li><li><a href="https://www.linkedin.com/in/yazhuo-zhang-80833b15b/" rel="noopener noreferrer" target="_blank">Yazhuo's LinkedIn</a></li><li><a href="https://twitter.com/Yazhuo11" rel="noopener noreferrer" target="_blank">Yazhuo's Twitter/X</a></li><li><a href="https://cachemon.github.io/SIEVE-website/" rel="noopener noreferrer" target="_blank">Cachemon/SIEVE's website</a></li><li><a href="https://s3fifo.com/" rel="noopener noreferrer" target="_blank">S3FIFO website</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, we explore the world of caching with Yazhuo Zhang, who introduces the game-changing SIEVE algorithm. Traditional eviction algorithms have long struggled with a trade-off between efficiency, throughput, and simplicity. However, SIEVE disrupts this balance by offering a simpler alternative to LRU while outperforming state-of-the-art algorithms in both efficiency and scalability for web cache workloads. Implemented in five production cache libraries with minimal code changes, SIEVE's superiority shines through in a comprehensive evaluation across 1559 cache traces. With up to a remarkable 63.2% lower miss ratio than ARC and surpassing nine other algorithms in over 45% of cases, SIEVE's simplicity doesn't compromise on scalability, doubling throughput compared to optimized LRU implementations. Join us as Yazhuo reveals how SIEVE is set to redefine caching efficiency, promising faster and more streamlined data serving in production systems.</p><p><br></p><h3>Links:</h3><ul><li><a href="https://yazhuozhang.com/assets/publication/nsdi24-sieve.pdf" rel="noopener noreferrer" target="_blank">SIEVE is Simpler than LRU: an Efficient Turn-Key Eviction Algorithm for Web Caches</a> (NSDI'24)</li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3600006.3613147" rel="noopener noreferrer" target="_blank">FIFO Queues are All You Need for Cache Eviction</a> (SOSP'23)</li><li><a href="https://yazhuozhang.com/" rel="noopener noreferrer" target="_blank">Yazhuo's homepage</a></li><li><a href="https://www.linkedin.com/in/yazhuo-zhang-80833b15b/" rel="noopener noreferrer" target="_blank">Yazhuo's LinkedIn</a></li><li><a href="https://twitter.com/Yazhuo11" rel="noopener noreferrer" target="_blank">Yazhuo's Twitter/X</a></li><li><a href="https://cachemon.github.io/SIEVE-website/" rel="noopener noreferrer" target="_blank">Cachemon/SIEVE's website</a></li><li><a href="https://s3fifo.com/" rel="noopener noreferrer" target="_blank">S3FIFO website</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Introducing the High Impact Series...</title>
			<itunes:title>Introducing the High Impact Series...</itunes:title>
			<pubDate>Mon, 06 May 2024 07:06:26 GMT</pubDate>
			<itunes:duration>2:40</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/6632769b3eb92b0012acc029/media.mp3" length="2566272" type="audio/mpeg"/>
			<guid isPermaLink="false">6632769b3eb92b0012acc029</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/introducing-the-high-impact-series</link>
			<acast:episodeId>6632769b3eb92b0012acc029</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>introducing-the-high-impact-series</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4I4ATMUIL6dUVA8NwhTNcWqs0vuQRR7DrRodsJUwgAws8Vl/+vLj0iUXQ3xDGJQZJ7Sa4yvBZaEiGWacm71R0ZI]]></acast:settings>
			<itunes:subtitle>High Impact</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>7</itunes:season>
			<itunes:episode>1</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1714582094545-c0ec9536a00416ea244f88a1feb5ec36.jpeg"/>
			<description><![CDATA[<p>Introducing the High Impact Series! </p><br><p>Hey folks, we have a new series coming soon inspired by the blog post “<a href="https://rmarcus.info/blog/2023/07/25/papers.html" rel="noopener noreferrer" target="_blank">Most Influential Database Papers</a>” by <a href="https://rmarcus.info/blog/" rel="noopener noreferrer" target="_blank">Ryan Marcus</a>. The series will feature interviews with the authors of some of the most impactful work in the field of databases. We will talk about the stories behind this work, getting the authors to reflect on the impact it has had over the years, as well as getting their take on current trends in the field. </p><br><p>Proudly sponsored by <a href="https://www.raphtory.com" rel="noopener noreferrer" target="_blank">Pometry</a></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>Introducing the High Impact Series! </p><br><p>Hey folks, we have a new series coming soon inspired by the blog post “<a href="https://rmarcus.info/blog/2023/07/25/papers.html" rel="noopener noreferrer" target="_blank">Most Influential Database Papers</a>” by <a href="https://rmarcus.info/blog/" rel="noopener noreferrer" target="_blank">Ryan Marcus</a>. The series will feature interviews with the authors of some of the most impactful work in the field of databases. We will talk about the stories behind this work, getting the authors to reflect on the impact it has had over the years, as well as getting their take on current trends in the field. </p><br><p>Proudly sponsored by <a href="https://www.raphtory.com" rel="noopener noreferrer" target="_blank">Pometry</a></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Eleni Zapridou | Oligolithic Cross-task Optimizations across Isolated Workloads | #51</title>
			<itunes:title>Eleni Zapridou | Oligolithic Cross-task Optimizations across Isolated Workloads | #51</itunes:title>
			<pubDate>Mon, 29 Apr 2024 07:06:50 GMT</pubDate>
			<itunes:duration>38:42</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/662a9d6ca1c8cf001283b081/media.mp3" length="37171328" type="audio/mpeg"/>
			<guid isPermaLink="false">662a9d6ca1c8cf001283b081</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.cidrdb.org/cidr2024/papers/p31-zapridou.pdf</link>
			<acast:episodeId>662a9d6ca1c8cf001283b081</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>eleni-zapridou-oligolithic</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4IZZ1imqMfoQz1GsFOU48tdQQbIyQjOi3ZgiEdtMMpguPoeUd5lH7lCsZF+QKZ8I3f24Mja1dt13ERDfnxsrJ8x]]></acast:settings>
			<itunes:subtitle><![CDATA[CIDR'24]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>11</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1714068215663-9dbe9d75554f9eb99a6eacbf36321248.jpeg"/>
			<description><![CDATA[<p>In this episode, we talk to Eleni Zapridou and delve into the challenges of data processing within enterprises, where multiple applications operate concurrently on shared resources. Traditional resource boundaries between applications often lead to increased costs and resource consumption. However, as Eleni explains, the principle of functional isolation offers a solution by combining cross-task optimizations with performance isolation. We explore GroupShare, an innovative strategy that reduces CPU consumption and query latency, transforming data processing efficiency. Join us as we discuss the implications of functional isolation with Eleni and its potential to revolutionize enterprise data processing.</p><p><br></p><h4>Links:</h4><ul><li><a href="https://www.cidrdb.org/cidr2024/papers/p31-zapridou.pdf" rel="noopener noreferrer" target="_blank">CIDR'24 Paper</a></li><li><a href="https://twitter.com/elenizapridou" rel="noopener noreferrer" target="_blank">Eleni's Twitter</a></li><li><a href="https://www.linkedin.com/in/eleni-zapridou-8103b6159/" rel="noopener noreferrer" target="_blank">Eleni's LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, we talk to Eleni Zapridou and delve into the challenges of data processing within enterprises, where multiple applications operate concurrently on shared resources. Traditional resource boundaries between applications often lead to increased costs and resource consumption. However, as Eleni explains, the principle of functional isolation offers a solution by combining cross-task optimizations with performance isolation. We explore GroupShare, an innovative strategy that reduces CPU consumption and query latency, transforming data processing efficiency. Join us as we discuss the implications of functional isolation with Eleni and its potential to revolutionize enterprise data processing.</p><p><br></p><h4>Links:</h4><ul><li><a href="https://www.cidrdb.org/cidr2024/papers/p31-zapridou.pdf" rel="noopener noreferrer" target="_blank">CIDR'24 Paper</a></li><li><a href="https://twitter.com/elenizapridou" rel="noopener noreferrer" target="_blank">Eleni's Twitter</a></li><li><a href="https://www.linkedin.com/in/eleni-zapridou-8103b6159/" rel="noopener noreferrer" target="_blank">Eleni's LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Pat Helland | Scalable OLTP in the Cloud: What’s the BIG DEAL? | #50</title>
			<itunes:title>Pat Helland | Scalable OLTP in the Cloud: What’s the BIG DEAL? | #50</itunes:title>
			<pubDate>Mon, 15 Apr 2024 07:55:44 GMT</pubDate>
			<itunes:duration>1:20:03</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/6612678362e1bc0016406f61/media.mp3" length="76855424" type="audio/mpeg"/>
			<guid isPermaLink="false">6612678362e1bc0016406f61</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.cidrdb.org/cidr2024/papers/p63-helland.pdf</link>
			<acast:episodeId>6612678362e1bc0016406f61</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>pat-helland-scalable-oltp-in-the-cloud</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4LPDkC+EWRQdY3xewxQ0Al273j+SLfSFmdKUST6h2pGbAsAqhShTvbjJl/5Ud/Ezle0WO2vE8Dg546L9UXJgS1L]]></acast:settings>
			<itunes:subtitle><![CDATA[CIDR'24]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>10</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1712481617254-13fe7b56e6ae88dd8e58c6359b30a559.jpeg"/>
			<description><![CDATA[<p>In this thought-provoking podcast episode, we dive into the world of scalable OLTP (OnLine Transaction Processing) systems with the insightful Pat Helland. As a seasoned expert in the field, Pat shares his insights on the critical role of isolation semantics in the scalability of OLTP systems, emphasizing its significance as the "BIG DEAL." By examining the interface between OLTP databases and applications, particularly through the lens of RCSI (READ COMMITTED SNAPSHOT ISOLATION) SQL databases, Pat talks about the limitations imposed by current database architectures and application patterns on scalability.</p><br><p>Through a compelling thought experiment, Pat explores the asymptotic limits to scale for OLTP systems, challenging the status quo and envisioning a reimagined approach to building both databases and applications that empowers scalability while adhering to the established RCSI semantics. By shedding light on how today's popular databases and common app patterns may unnecessarily hinder scalability, Pat sparks discussions within the database community, paving the way for new opportunities and advancements in OLTP systems. 
Join us as we delve into this conversation with Pat Helland, where every insight shared could potentially catalyze significant transformations in the realm of OLTP scalability.</p><br><p>Papers mentioned during the episode:</p><ul><li><a href="https://www.cidrdb.org/cidr2024/papers/p63-helland.pdf" rel="noopener noreferrer" target="_blank">Scalable OLTP in the Cloud: What’s the BIG DEAL?</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3526208" rel="noopener noreferrer" target="_blank">Autonomous Computing</a></li><li><a href="https://www.cidrdb.org/cidr2022/papers/p5-helland.pdf" rel="noopener noreferrer" target="_blank">Decoupled Transactions</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3475965.3480470" rel="noopener noreferrer" target="_blank">Don't Get Stuck in the "Con" Game</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3386269" rel="noopener noreferrer" target="_blank">The Best Place to Build a Subway</a></li><li><a href="https://arxiv.org/pdf/0909.1788.pdf" rel="noopener noreferrer" target="_blank">Building on Quicksand</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3080010" rel="noopener noreferrer" target="_blank">Side effects, front and center</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/2844112" rel="noopener noreferrer" target="_blank">Immutability changes everything</a></li><li><a href="https://www.cidrdb.org/cidr2023/papers/p50-ziegler.pdf" rel="noopener noreferrer" target="_blank">Is Scalable OLTP in the Cloud a solved problem?</a></li></ul><p><br></p><p>You can find Pat on:</p><ul><li><a href="https://x.com/PatHelland" rel="noopener noreferrer" target="_blank">Twitter/X</a></li><li><a href="https://www.linkedin.com/in/pathelland/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li><li><a href="https://pathelland.substack.com/" rel="noopener noreferrer" target="_blank">Scattered Thoughts on Distributed Systems</a> </li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. 
See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this thought-provoking podcast episode, we dive into the world of scalable OLTP (OnLine Transaction Processing) systems with the insightful Pat Helland. As a seasoned expert in the field, Pat shares his insights on the critical role of isolation semantics in the scalability of OLTP systems, emphasizing its significance as the "BIG DEAL." By examining the interface between OLTP databases and applications, particularly through the lens of RCSI (READ COMMITTED SNAPSHOT ISOLATION) SQL databases, Pat talks about the limitations imposed by current database architectures and application patterns on scalability.</p><br><p>Through a compelling thought experiment, Pat explores the asymptotic limits to scale for OLTP systems, challenging the status quo and envisioning a reimagined approach to building both databases and applications that empowers scalability while adhering to the established RCSI semantics. By shedding light on how today's popular databases and common app patterns may unnecessarily hinder scalability, Pat sparks discussions within the database community, paving the way for new opportunities and advancements in OLTP systems. 
Join us as we delve into this conversation with Pat Helland, where every insight shared could potentially catalyze significant transformations in the realm of OLTP scalability.</p><br><p>Papers mentioned during the episode:</p><ul><li><a href="https://www.cidrdb.org/cidr2024/papers/p63-helland.pdf" rel="noopener noreferrer" target="_blank">Scalable OLTP in the Cloud: What’s the BIG DEAL?</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3526208" rel="noopener noreferrer" target="_blank">Autonomous Computing</a></li><li><a href="https://www.cidrdb.org/cidr2022/papers/p5-helland.pdf" rel="noopener noreferrer" target="_blank">Decoupled Transactions</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3475965.3480470" rel="noopener noreferrer" target="_blank">Don't Get Stuck in the "Con" Game</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3386269" rel="noopener noreferrer" target="_blank">The Best Place to Build a Subway</a></li><li><a href="https://arxiv.org/pdf/0909.1788.pdf" rel="noopener noreferrer" target="_blank">Building on Quicksand</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3080010" rel="noopener noreferrer" target="_blank">Side effects, front and center</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/2844112" rel="noopener noreferrer" target="_blank">Immutability changes everything</a></li><li><a href="https://www.cidrdb.org/cidr2023/papers/p50-ziegler.pdf" rel="noopener noreferrer" target="_blank">Is Scalable OLTP in the Cloud a solved problem?</a></li></ul><p><br></p><p>You can find Pat on:</p><ul><li><a href="https://x.com/PatHelland" rel="noopener noreferrer" target="_blank">Twitter/X</a></li><li><a href="https://www.linkedin.com/in/pathelland/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li><li><a href="https://pathelland.substack.com/" rel="noopener noreferrer" target="_blank">Scattered Thoughts on Distributed Systems</a> </li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. 
See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Rui Liu | Towards Resource-adaptive Query Execution in Cloud Native Databases | #49</title>
			<itunes:title>Rui Liu | Towards Resource-adaptive Query Execution in Cloud Native Databases | #49</itunes:title>
			<pubDate>Mon, 01 Apr 2024 07:30:18 GMT</pubDate>
			<itunes:duration>53:52</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/660892677aefcb0016454065/media.mp3" length="51726464" type="audio/mpeg"/>
			<guid isPermaLink="false">660892677aefcb0016454065</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.cidrdb.org/cidr2024/papers/p34-liu.pdf</link>
			<acast:episodeId>660892677aefcb0016454065</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>rui-liu-towards-resource-adaptive-query-execution</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4JPL/cCQPqOCA/7n8ihsxTlSqwbj88YKUHTWXiUiIms2tlAQfQuGOZwLMvt0XUW3mQGNYRKjqqLOYhKNq+PXlkJ]]></acast:settings>
			<itunes:subtitle><![CDATA[CIDR'24]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>9</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1711837267059-3026aca649aebb528721021052f55bad.jpeg"/>
			<description><![CDATA[<p>In this episode, we talk to Rui Liu and explore the transformative potential of Ratchet, a groundbreaking resource-adaptive query execution framework. We delve into the challenges posed by ephemeral resources in modern cloud environments and the innovative solutions offered by Ratchet. Rui guides us through the intricacies of Ratchet's design, highlighting its ability to enable adaptive query suspension and resumption, sophisticated resource arbitration for diverse workloads, and a fine-grained pricing model to navigate fluctuating resource availability. Join us as we uncover the future of cloud-native databases and workloads, and discover how Ratchet is poised to revolutionize the way we harness the power of dynamic cloud resources.</p><br><p>Links:</p><ul><li><a href="https://www.cidrdb.org/cidr2024/papers/p34-liu.pdf" rel="noopener noreferrer" target="_blank">CIDR'24 Paper</a></li><li><a href="https://www.linkedin.com/in/csruiliu/" rel="noopener noreferrer" target="_blank">Rui's LinkedIn </a></li><li><a href="https://twitter.com/csruiliu" rel="noopener noreferrer" target="_blank">Rui's Twitter/X</a></li><li><a href="https://csruiliu.github.io/" rel="noopener noreferrer" target="_blank">Rui's Homepage</a></li></ul><p>You can find links to all Rui's work from his <a href="https://scholar.google.com/citations?hl=en&amp;user=2dQqHgcAAAAJ" rel="noopener noreferrer" target="_blank">Google Scholar</a> profile.</p><br><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, we talk to Rui Liu and explore the transformative potential of Ratchet, a groundbreaking resource-adaptive query execution framework. We delve into the challenges posed by ephemeral resources in modern cloud environments and the innovative solutions offered by Ratchet. Rui guides us through the intricacies of Ratchet's design, highlighting its ability to enable adaptive query suspension and resumption, sophisticated resource arbitration for diverse workloads, and a fine-grained pricing model to navigate fluctuating resource availability. Join us as we uncover the future of cloud-native databases and workloads, and discover how Ratchet is poised to revolutionize the way we harness the power of dynamic cloud resources.</p><br><p>Links:</p><ul><li><a href="https://www.cidrdb.org/cidr2024/papers/p34-liu.pdf" rel="noopener noreferrer" target="_blank">CIDR'24 Paper</a></li><li><a href="https://www.linkedin.com/in/csruiliu/" rel="noopener noreferrer" target="_blank">Rui's LinkedIn </a></li><li><a href="https://twitter.com/csruiliu" rel="noopener noreferrer" target="_blank">Rui's Twitter/X</a></li><li><a href="https://csruiliu.github.io/" rel="noopener noreferrer" target="_blank">Rui's Homepage</a></li></ul><p>You can find links to all Rui's work from his <a href="https://scholar.google.com/citations?hl=en&amp;user=2dQqHgcAAAAJ" rel="noopener noreferrer" target="_blank">Google Scholar</a> profile.</p><br><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Yifei Yang | Predicate Transfer: Efficient Pre-Filtering on Multi-Join Queries | #48</title>
			<itunes:title>Yifei Yang | Predicate Transfer: Efficient Pre-Filtering on Multi-Join Queries | #48</itunes:title>
			<pubDate>Mon, 18 Mar 2024 06:31:13 GMT</pubDate>
			<itunes:duration>47:37</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/65f753183cf4df0017d81b2e/media.mp3" length="45725824" type="audio/mpeg"/>
			<guid isPermaLink="false">65f753183cf4df0017d81b2e</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.cidrdb.org/cidr2024/papers/p22-yang.pdf</link>
			<acast:episodeId>65f753183cf4df0017d81b2e</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>yifei-yang-predicate-transfer</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4IJZj1u+iqrwp/OFKOhjryYDhMXNEPRIj/M3E278S3WTovpj3JR7e+/jgoqFcpzDtnrfUwN0HW1Gnv9X3Lc2A+U]]></acast:settings>
			<itunes:subtitle><![CDATA[CIDR'24]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>8</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1710707025485-e64f9e7ef8e3c30b35f1d47f8bca2d5b.jpeg"/>
			<description><![CDATA[<p>In this episode, Yifei Yang introduces predicate transfer, a revolutionary method for optimizing join performance in databases. Predicate transfer builds on Bloom joins, extending its benefits to multi-table joins. Inspired by Yannakakis's theoretical insights, predicate transfer leverages Bloom filters to achieve significant speed improvements. Yang's evaluation shows an average 3.3× performance boost over Bloom join on the TPC-H benchmark, highlighting the potential of predicate transfer to revolutionize database query optimization. Join us as we explore the transformative impact of predicate transfer on database operations.</p><br><p>Links:</p><ul><li><a href="https://www.cidrdb.org/cidr2024/papers/p22-yang.pdf" rel="noopener noreferrer" target="_blank">CIDR'24 Paper</a></li><li><a href="https://www.linkedin.com/in/yifei-yang-551345161/" rel="noopener noreferrer" target="_blank">Yifei's LinkedIn</a></li><li><a href="https://www.buymeacoffee.com/disseminate" rel="noopener noreferrer" target="_blank">Buy Me A Coffee</a></li><li><a href="https://docs.google.com/forms/d/e/1FAIpQLSd_y4wD0OSYjFLGeqOLmSRLNCpbWZBaKdIA16IWY1uk6Kte1w/viewform" rel="noopener noreferrer" target="_blank">Listener Survey</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, Yifei Yang introduces predicate transfer, a revolutionary method for optimizing join performance in databases. Predicate transfer builds on Bloom joins, extending its benefits to multi-table joins. Inspired by Yannakakis's theoretical insights, predicate transfer leverages Bloom filters to achieve significant speed improvements. Yang's evaluation shows an average 3.3× performance boost over Bloom join on the TPC-H benchmark, highlighting the potential of predicate transfer to revolutionize database query optimization. Join us as we explore the transformative impact of predicate transfer on database operations.</p><br><p>Links:</p><ul><li><a href="https://www.cidrdb.org/cidr2024/papers/p22-yang.pdf" rel="noopener noreferrer" target="_blank">CIDR'24 Paper</a></li><li><a href="https://www.linkedin.com/in/yifei-yang-551345161/" rel="noopener noreferrer" target="_blank">Yifei's LinkedIn</a></li><li><a href="https://www.buymeacoffee.com/disseminate" rel="noopener noreferrer" target="_blank">Buy Me A Coffee</a></li><li><a href="https://docs.google.com/forms/d/e/1FAIpQLSd_y4wD0OSYjFLGeqOLmSRLNCpbWZBaKdIA16IWY1uk6Kte1w/viewform" rel="noopener noreferrer" target="_blank">Listener Survey</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Vikramank Singh | Panda: Performance Debugging for Databases using LLM Agents | #47</title>
			<itunes:title>Vikramank Singh | Panda: Performance Debugging for Databases using LLM Agents | #47</itunes:title>
			<pubDate>Mon, 04 Mar 2024 08:08:54 GMT</pubDate>
			<itunes:duration>1:08:12</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/65e0e7c7babcda0016753831/media.mp3" length="65478784" type="audio/mpeg"/>
			<guid isPermaLink="false">65e0e7c7babcda0016753831</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.cidrdb.org/cidr2024/papers/p6-singh.pdf</link>
			<acast:episodeId>65e0e7c7babcda0016753831</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>vikramank-singh-panda-performance-debugging-for-databases-us</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4LywbIx1OjsiRgG93tRCLIJAqRFSEGBsNKKyalLwNSfuLS6oCL67IEPggiCDwaTY/E6atGRw6oCk66a6/WLtb9e]]></acast:settings>
			<itunes:subtitle><![CDATA[CIDR'24]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>7</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1709237795404-66f4a827e786e84183e17af02ffd6398.jpeg"/>
			<description><![CDATA[<p>In this episode, Vikramank Singh introduces the Panda framework, aimed at refining Large Language Models' (LLMs) capability to address database performance issues. Vikramank elaborates on Panda's four components—Grounding, Verification, Affordance, and Feedback—illustrating how they collaborate to contextualize LLM responses and deliver actionable recommendations. By bridging the divide between technical knowledge and practical troubleshooting needs, Panda has the potential to revolutionize database debugging practices, offering a promising avenue for more effective and efficient resolution of performance challenges in database systems. Tune in to learn more! </p><br><p>Links:</p><ul><li><a href="https://www.cidrdb.org/cidr2024/papers/p6-singh.pdf" rel="noopener noreferrer" target="_blank">CIDR'24 Paper</a></li><li><a href="https://www.linkedin.com/in/vikramanksingh/" rel="noopener noreferrer" target="_blank">Vikramank's LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, Vikramank Singh introduces the Panda framework, aimed at refining Large Language Models' (LLMs) capability to address database performance issues. Vikramank elaborates on Panda's four components—Grounding, Verification, Affordance, and Feedback—illustrating how they collaborate to contextualize LLM responses and deliver actionable recommendations. By bridging the divide between technical knowledge and practical troubleshooting needs, Panda has the potential to revolutionize database debugging practices, offering a promising avenue for more effective and efficient resolution of performance challenges in database systems. Tune in to learn more! </p><br><p>Links:</p><ul><li><a href="https://www.cidrdb.org/cidr2024/papers/p6-singh.pdf" rel="noopener noreferrer" target="_blank">CIDR'24 Paper</a></li><li><a href="https://www.linkedin.com/in/vikramanksingh/" rel="noopener noreferrer" target="_blank">Vikramank's LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Tamer Eldeeb | Chablis: Fast and General Transactions in Geo-Distributed Systems | #46</title>
			<itunes:title>Tamer Eldeeb | Chablis: Fast and General Transactions in Geo-Distributed Systems | #46</itunes:title>
			<pubDate>Mon, 12 Feb 2024 21:09:12 GMT</pubDate>
			<itunes:duration>1:02:27</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/65ca88f82964880017c4ecea/media.mp3" length="59959424" type="audio/mpeg"/>
			<guid isPermaLink="false">65ca88f82964880017c4ecea</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.cidrdb.org/cidr2024/papers/p4-eldeeb.pdf</link>
			<acast:episodeId>65ca88f82964880017c4ecea</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>tamer-eldeeb-chablis</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4JeVXi7ndXaOOFRx+tC2Ch/AdI7Pl/XCzbHn6cLc4fUwHCqg00By6ZJkUpjrBkNcC9Dn2mg3CwZC6R3QZ/UH7P3]]></acast:settings>
			<itunes:subtitle><![CDATA[CIDR'24]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>6</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1707771796544-af32c704ad4c76e78508e22d1a2dd6a5.jpeg"/>
			<description><![CDATA[<p>In this episode, Tamer Eldeeb sheds light on the challenges faced by geo-distributed database management systems (DBMSes) in supporting strictly-serializable transactions across multiple regions. He discusses the compromises often made between low-latency regional writes and restricted programming models in existing DBMS solutions. Tamer introduces Chablis, a groundbreaking geo-distributed, multi-versioned transactional key-value store designed to overcome these limitations.</p><p>Chablis offers a general interface accommodating range and point reads, along with writes within multi-step strictly-serializable ACID transactions. Leveraging advancements in low-latency datacenter networks and innovative DBMS designs, Chablis eliminates the need for compromises, ensuring fast read-write transactions with low latency within a single region, while enabling global strictly-serializable lock-free snapshot reads. Join us as we explore the transformative potential of Chablis in revolutionizing the landscape of geo-distributed DBMSes and facilitating seamless transactional operations across distributed environments.</p><p><br></p><ul><li><a href="https://www.cidrdb.org/cidr2024/papers/p4-eldeeb.pdf" rel="noopener noreferrer" target="_blank">CIDR'24 Chablis Paper</a></li><li><a href="https://www.usenix.org/system/files/osdi23-eldeeb.pdf" rel="noopener noreferrer" target="_blank">OSDI'23 Chardonnay paper</a></li><li><a href="https://www.linkedin.com/in/tamer-eldeeb-bb691a20/" rel="noopener noreferrer" target="_blank">Tamer's LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>In this episode, Tamer Eldeeb sheds light on the challenges faced by geo-distributed database management systems (DBMSes) in supporting strictly-serializable transactions across multiple regions. He discusses the compromises often made between low-latency regional writes and restricted programming models in existing DBMS solutions. Tamer introduces Chablis, a groundbreaking geo-distributed, multi-versioned transactional key-value store designed to overcome these limitations.</p><p>Chablis offers a general interface accommodating range and point reads, along with writes within multi-step strictly-serializable ACID transactions. Leveraging advancements in low-latency datacenter networks and innovative DBMS designs, Chablis eliminates the need for compromises, ensuring fast read-write transactions with low latency within a single region, while enabling global strictly-serializable lock-free snapshot reads. Join us as we explore the transformative potential of Chablis in revolutionizing the landscape of geo-distributed DBMSes and facilitating seamless transactional operations across distributed environments.</p><p><br></p><ul><li><a href="https://www.cidrdb.org/cidr2024/papers/p4-eldeeb.pdf" rel="noopener noreferrer" target="_blank">CIDR'24 Chablis Paper</a></li><li><a href="https://www.usenix.org/system/files/osdi23-eldeeb.pdf" rel="noopener noreferrer" target="_blank">OSDI'23 Chardonnay paper</a></li><li><a href="https://www.linkedin.com/in/tamer-eldeeb-bb691a20/" rel="noopener noreferrer" target="_blank">Tamer's Linkedin</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Matt Butrovich | Tigger: A Database Proxy That Bounces With User-Bypass | #45</title>
			<itunes:title>Matt Butrovich | Tigger: A Database Proxy That Bounces With User-Bypass | #45</itunes:title>
			<pubDate>Mon, 18 Dec 2023 08:04:38 GMT</pubDate>
			<itunes:duration>1:03:55</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/657f4efbec32d70017d5e73f/media.mp3" length="61368448" type="audio/mpeg"/>
			<guid isPermaLink="false">657f4efbec32d70017d5e73f</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.vldb.org/pvldb/vol16/p3335-butrovich.pdf</link>
			<acast:episodeId>657f4efbec32d70017d5e73f</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>matt-butrovich-tigger-a-database-proxy-that-bounces-with-use</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4Kyl4oM1toPj1B5H/KbQvW+I3869KsvGOCobQdNSeYNUSgmXG85mI/xUERPt6SCR+8g3uMpLCiRuX0MqGbCM4p/]]></acast:settings>
			<itunes:subtitle><![CDATA[VLDB'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>5</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1702841610699-4758c0dd33d2b2d87ab1b310dbed6162.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p><br></p><p>In this episode, we chat to Matt Butrovich about his research on database proxies. We discuss the inefficiencies of traditional database proxies, which operate in user-space, causing overhead due to buffer copying and system calls. Matt introduces "user-bypass" which leverages Linux's eBPF infrastructure to move application logic into kernel-space. Matt then tells us about Tigger, a PostgreSQL-compatible DBMS proxy, showcasing user-bypass benefits. Tune in to hear about the experiments that demonstrate how Tigger can achieve up to a 29% reduction in transaction latencies and a 42% reduction in CPU utilization compared to other widely-used proxies.</p><p><br></p><h3>Links: </h3><ul><li><a href="https://mattbutrovi.ch/" rel="noopener noreferrer" target="_blank">Matt's homepage</a></li><li><a href="https://www.vldb.org/pvldb/vol16/p3335-butrovich.pdf" rel="noopener noreferrer" target="_blank">VLDB'23 paper</a></li><li><a href="https://github.com/mbutrovich/tigger" rel="noopener noreferrer" target="_blank">Tigger's Github repo</a></li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p><br></p><p>In this episode, we chat to Matt Butrovich about his research on database proxies. We discuss the inefficiencies of traditional database proxies, which operate in user-space, causing overhead due to buffer copying and system calls. Matt introduces "user-bypass" which leverages Linux's eBPF infrastructure to move application logic into kernel-space. Matt then tells us about Tigger, a PostgreSQL-compatible DBMS proxy, showcasing user-bypass benefits. Tune in to hear about the experiments that demonstrate how Tigger can achieve up to a 29% reduction in transaction latencies and a 42% reduction in CPU utilization compared to other widely-used proxies.</p><p><br></p><h3>Links: </h3><ul><li><a href="https://mattbutrovi.ch/" rel="noopener noreferrer" target="_blank">Matt's homepage</a></li><li><a href="https://www.vldb.org/pvldb/vol16/p3335-butrovich.pdf" rel="noopener noreferrer" target="_blank">VLDB'23 paper</a></li><li><a href="https://github.com/mbutrovich/tigger" rel="noopener noreferrer" target="_blank">Tigger's Github repo</a></li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Gábor Szárnyas | The LDBC Social Network Benchmark: Business Intelligence Workload | #44</title>
			<itunes:title>Gábor Szárnyas | The LDBC Social Network Benchmark: Business Intelligence Workload | #44</itunes:title>
			<pubDate>Mon, 04 Dec 2023 08:04:20 GMT</pubDate>
			<itunes:duration>46:34</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/656c90c86ce42b001296f0da/media.mp3" length="44712064" type="audio/mpeg"/>
			<guid isPermaLink="false">656c90c86ce42b001296f0da</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://dl.acm.org/doi/abs/10.14778/3574245.3574270</link>
			<acast:episodeId>656c90c86ce42b001296f0da</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>gabor-szarnyas-ldbc-bi</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4Jtlj6h77fuZxgjE8jVMwrwRgAhMESHvQ6Lp8C4HBa+A7plNSEhW1mrx3YgokI+S7Oepu9WEdxLZ6ARwZfEioyt]]></acast:settings>
			<itunes:subtitle><![CDATA[VLDB'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>4</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1701613130145-73183623ea744247fdabe126b7dd7132.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p>In this episode, Gábor Szárnyas takes us on a journey through the LDBC Social Network Benchmark's Business Intelligence workload (SNB BI). Developed through collaboration between academia and industry, the SNB BI is a comprehensive graph OLAP benchmark. It pushes the boundaries of synthetic and scalable analytical database benchmarks, featuring a sophisticated data generator and a temporal graph with small-world phenomena. The benchmark's query workload, rooted in LDBC's innovative design methodology, aims to drive future technical advancements in graph database systems. Gábor highlights SNB BI's unique features, including the adoption of "parameter curation" for stable query runtimes across diverse parameters. Join us for a succinct yet insightful exploration of SNB BI, where Gábor Szárnyas unveils the intricacies shaping the forefront of analytical data systems and graph workloads.</p><p><br></p><h3>Links: </h3><ul><li><a href="https://dl.acm.org/doi/abs/10.14778/3574245.3574270" rel="noopener noreferrer" target="_blank">VLDB'23 Paper</a></li><li><a href="https://szarnyasg.github.io/" rel="noopener noreferrer" target="_blank">Gábor's Homepage</a></li><li><a href="https://ldbcouncil.org/" rel="noopener noreferrer" target="_blank">LDBC Homepage</a></li><li><a href="https://github.com/ldbc" rel="noopener noreferrer" target="_blank">LDBC GitHub</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p>In this episode, Gábor Szárnyas takes us on a journey through the LDBC Social Network Benchmark's Business Intelligence workload (SNB BI). Developed through collaboration between academia and industry, the SNB BI is a comprehensive graph OLAP benchmark. It pushes the boundaries of synthetic and scalable analytical database benchmarks, featuring a sophisticated data generator and a temporal graph with small-world phenomena. The benchmark's query workload, rooted in LDBC's innovative design methodology, aims to drive future technical advancements in graph database systems. Gábor highlights SNB BI's unique features, including the adoption of "parameter curation" for stable query runtimes across diverse parameters. Join us for a succinct yet insightful exploration of SNB BI, where Gábor Szárnyas unveils the intricacies shaping the forefront of analytical data systems and graph workloads.</p><p><br></p><h3>Links: </h3><ul><li><a href="https://dl.acm.org/doi/abs/10.14778/3574245.3574270" rel="noopener noreferrer" target="_blank">VLDB'23 Paper</a></li><li><a href="https://szarnyasg.github.io/" rel="noopener noreferrer" target="_blank">Gábor's Homepage</a></li><li><a href="https://ldbcouncil.org/" rel="noopener noreferrer" target="_blank">LDBC Homepage</a></li><li><a href="https://github.com/ldbc" rel="noopener noreferrer" target="_blank">LDBC GitHub</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Thaleia Doudali | Is Machine Learning Necessary for Cloud Resource Usage Forecasting? | #43</title>
			<itunes:title>Thaleia Doudali | Is Machine Learning Necessary for Cloud Resource Usage Forecasting? | #43</itunes:title>
			<pubDate>Mon, 20 Nov 2023 07:04:14 GMT</pubDate>
			<itunes:duration>49:13</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/655a6121c8ed51001304bcf0/media.mp3" length="47253632" type="audio/mpeg"/>
			<guid isPermaLink="false">655a6121c8ed51001304bcf0</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://dl.acm.org/doi/abs/10.1145/3620678.3624790</link>
			<acast:episodeId>655a6121c8ed51001304bcf0</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>thaleia-doudali-is-machine-learning-necessary-for-cloud-reso</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4IFGsY3eceROy8oGdHBxYLBkB4Ec2nbkMzqaQvjDwKrbgDDhiNnHTKvv2sWE5quTAnsDbM5JqNYirCAspxCXIcD]]></acast:settings>
			<itunes:subtitle><![CDATA[SoCC'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>3</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1700421617039-f3ecddb0915f7fc59a15b42fde8e8302.jpeg"/>
			<description><![CDATA[<h3>Summary:</h3><p><br></p><p>In this week's episode, we talk with Thaleia Doudali and explore the realm of cloud resource forecasting, focusing on the use of Long Short Term Memory (LSTM) neural networks, a popular machine learning model. Drawing from her research, Thaleia discusses the surprising discovery that, despite the complexity of ML models, accurate predictions often boil down to a simple shift of values by one time step. The discussion explores the nuances of time series data, encompassing resource metrics like CPU, memory, network, and disk I/O across different cloud providers and levels. Thaleia highlights the minimal variations observed in consecutive time steps, prompting a critical question: Do we really need complex machine learning models for effective forecasting? The episode concludes with Thaleia's vision for practical resource management systems, advocating for a thoughtful balance between simple solutions, such as data shifts, and the application of machine learning. Tune in as we unravel the layers of cloud resource forecasting with Thaleia Doudali.</p><p><br></p><h3>Links:</h3><ul><li><a href="https://dl.acm.org/doi/pdf/10.1145/3620678.3624790" rel="noopener noreferrer" target="_blank">SoCC'23 Paper</a></li><li><a href="https://thaleia-dimitradoudali.github.io/" rel="noopener noreferrer" target="_blank">Thaleia's Homepage</a></li><li><a href="https://software.imdea.org/" rel="noopener noreferrer" target="_blank">IMDEA Software Homepage</a></li><li><a href="https://github.com/muse-research-lab/cloud-forecast-data-persistence" rel="noopener noreferrer" target="_blank">GitHub Repo</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary:</h3><p><br></p><p>In this week's episode, we talk with Thaleia Doudali and explore the realm of cloud resource forecasting, focusing on the use of Long Short Term Memory (LSTM) neural networks, a popular machine learning model. Drawing from her research, Thaleia discusses the surprising discovery that, despite the complexity of ML models, accurate predictions often boil down to a simple shift of values by one time step. The discussion explores the nuances of time series data, encompassing resource metrics like CPU, memory, network, and disk I/O across different cloud providers and levels. Thaleia highlights the minimal variations observed in consecutive time steps, prompting a critical question: Do we really need complex machine learning models for effective forecasting? The episode concludes with Thaleia's vision for practical resource management systems, advocating for a thoughtful balance between simple solutions, such as data shifts, and the application of machine learning. Tune in as we unravel the layers of cloud resource forecasting with Thaleia Doudali.</p><p><br></p><h3>Links:</h3><ul><li><a href="https://dl.acm.org/doi/pdf/10.1145/3620678.3624790" rel="noopener noreferrer" target="_blank">SoCC'23 Paper</a></li><li><a href="https://thaleia-dimitradoudali.github.io/" rel="noopener noreferrer" target="_blank">Thaleia's Homepage</a></li><li><a href="https://software.imdea.org/" rel="noopener noreferrer" target="_blank">IMDEA Software Homepage</a></li><li><a href="https://github.com/muse-research-lab/cloud-forecast-data-persistence" rel="noopener noreferrer" target="_blank">GitHub Repo</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Jinkun Geng | Nezha: Deployable and High-Performance Consensus Using Synchronized Clocks | #42</title>
			<itunes:title>Jinkun Geng | Nezha: Deployable and High-Performance Consensus Using Synchronized Clocks | #42</itunes:title>
			<pubDate>Mon, 23 Oct 2023 07:04:41 GMT</pubDate>
			<itunes:duration>55:09</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/653417f030f8d6001237ad1e/media.mp3" length="52949120" type="audio/mpeg"/>
			<guid isPermaLink="false">653417f030f8d6001237ad1e</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.vldb.org/pvldb/vol16/p629-geng.pdf</link>
			<acast:episodeId>653417f030f8d6001237ad1e</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>jinkun-geng-nezha-deployable-and-high-performance-consensus-</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4IYkN9Dwz0dRtf5+j5wqliaYlApWV86lrkR0ze/uVivLDBMCtGhwxLq7BjSA2vf+ZMbutfFa4VLDlRMYWB+O52T]]></acast:settings>
			<itunes:subtitle><![CDATA[VLDB'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>2</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1697912176323-fe068c1bb6e703f30f7599fbb2a4fe90.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p>In this episode, Jinkun Geng talks to us about Nezha, a high-performance consensus protocol. Nezha can be deployed by cloud tenants without support from cloud providers. Nezha bridges the gap between protocols such as MultiPaxos and Raft, which can be readily deployed, and protocols such as NOPaxos and Speculative Paxos, which provide better performance but require access to technologies such as programmable switches and in-network prioritization, which cloud tenants do not have. Tune in to learn more!</p><p><br></p><h3>Links: </h3><ul><li><a href="https://www.gengjinkun.com/" rel="noopener noreferrer" target="_blank">Jinkun's Homepage</a></li><li><a href="https://www.vldb.org/pvldb/vol16/p629-geng.pdf" rel="noopener noreferrer" target="_blank">Nezha VLDB'23 Paper</a></li><li><a href="https://gitlab.com/steamgjk/nezhav2" rel="noopener noreferrer" target="_blank">Nezha GitLab Repo</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p>In this episode, Jinkun Geng talks to us about Nezha, a high-performance consensus protocol. Nezha can be deployed by cloud tenants without support from cloud providers. Nezha bridges the gap between protocols such as MultiPaxos and Raft, which can be readily deployed, and protocols such as NOPaxos and Speculative Paxos, which provide better performance but require access to technologies such as programmable switches and in-network prioritization, which cloud tenants do not have. Tune in to learn more!</p><p><br></p><h3>Links: </h3><ul><li><a href="https://www.gengjinkun.com/" rel="noopener noreferrer" target="_blank">Jinkun's Homepage</a></li><li><a href="https://www.vldb.org/pvldb/vol16/p629-geng.pdf" rel="noopener noreferrer" target="_blank">Nezha VLDB'23 Paper</a></li><li><a href="https://gitlab.com/steamgjk/nezhav2" rel="noopener noreferrer" target="_blank">Nezha GitLab Repo</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Dimitris Koutsoukos | NVM: Is it Not Very Meaningful for Databases? | #41</title>
			<itunes:title>Dimitris Koutsoukos | NVM: Is it Not Very Meaningful for Databases? | #41</itunes:title>
			<pubDate>Mon, 09 Oct 2023 06:04:25 GMT</pubDate>
			<itunes:duration>48:57</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/6521c58c2017820011d0dae6/media.mp3" length="47007872" type="audio/mpeg"/>
			<guid isPermaLink="false">6521c58c2017820011d0dae6</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.vldb.org/pvldb/vol16/p2444-koutsoukos.pdf</link>
			<acast:episodeId>6521c58c2017820011d0dae6</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>dimitris-koutsoukos-nvm-is-it-not-very-meaningful-for-databa</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsoxhINu4Ad7VkAnsB5MGv7SjLHU7InKz+DwI+ZrtF5BMs7rtVjpA2oPbtuVg8EqZihGe2wz8pdi+RDNK0BU3Q+WHbkI7kjy5Thwu/Q7nQtuc=]]></acast:settings>
			<itunes:subtitle><![CDATA[VLDB'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>6</itunes:season>
			<itunes:episode>1</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1696711655666-5af79c5afa40d87fedfe44b1bdac89e2.jpeg"/>
			<description><![CDATA[<p>Summary: </p><br><p>In this episode, Dimitris Koutsoukos talks to us about Persistent or Non Volatile Memory (PMEM) and we answer the question: Is it Not Very Meaningful for Databases?&nbsp;</p><br><p>PMEM offers expanded memory capacity and faster access to persistent storage. However, (before Dimitris's work) there was no comprehensive empirical analysis of existing database engines under different PMEM modes, to understand how databases can benefit from the various hardware configurations. Dimitris and his colleagues then analyzed multiple different engines under common benchmarks with PMEM in AppDirect mode and Memory mode - tune in to hear the findings!</p><br><p>Links:</p><ul><li><a href="https://www.vldb.org/pvldb/vol16/p2444-koutsoukos.pdf" rel="noopener noreferrer" target="_blank">VLDB'23 Paper</a></li><li><a href="https://dkoutsou.github.io/" rel="noopener noreferrer" target="_blank">Dimitris's Homepage</a></li><li><a href="https://github.com/dkoutsou/database-benchmarking-optane" rel="noopener noreferrer" target="_blank">Study's source code</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>Summary: </p><br><p>In this episode, Dimitris Koutsoukos talks to us about Persistent or Non Volatile Memory (PMEM) and we answer the question: Is it Not Very Meaningful for Databases?&nbsp;</p><br><p>PMEM offers expanded memory capacity and faster access to persistent storage. However, (before Dimitris's work) there was no comprehensive empirical analysis of existing database engines under different PMEM modes, to understand how databases can benefit from the various hardware configurations. Dimitris and his colleagues then analyzed multiple different engines under common benchmarks with PMEM in AppDirect mode and Memory mode - tune in to hear the findings!</p><br><p>Links:</p><ul><li><a href="https://www.vldb.org/pvldb/vol16/p2444-koutsoukos.pdf" rel="noopener noreferrer" target="_blank">VLDB'23 Paper</a></li><li><a href="https://dkoutsou.github.io/" rel="noopener noreferrer" target="_blank">Dimitris's Homepage</a></li><li><a href="https://github.com/dkoutsou/database-benchmarking-optane" rel="noopener noreferrer" target="_blank">Study's source code</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Mohamed Alzayat | Groundhog: Efficient Request Isolation in FaaS | #40</title>
			<itunes:title>Mohamed Alzayat | Groundhog: Efficient Request Isolation in FaaS | #40</itunes:title>
			<pubDate>Mon, 11 Sep 2023 07:04:39 GMT</pubDate>
			<itunes:duration>42:46</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/64fd966489a2970012c7fdfd/media.mp3" length="41068888" type="audio/mpeg"/>
			<guid isPermaLink="false">64fd966489a2970012c7fdfd</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://dl.acm.org/doi/10.1145/3552326.3567503</link>
			<acast:episodeId>64fd966489a2970012c7fdfd</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>mohamed-alzayat-groundhog-efficient-request-isolation-in-faa</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsoxhINu4Ad7VkAnsB5MGv7ZIR4/zRMM7g7Iqm8X5BuD9QKu6YvQwMOYLN4fBBsBRMZ19htksOw8cKfMfoS4aTnqb06Q4Kz7ZrWwzM2MkCaqg=]]></acast:settings>
			<itunes:subtitle><![CDATA[EuroSys'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>5</itunes:season>
			<itunes:episode>10</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1694321006886-15f4b903e29a325a67cb077192e76995.jpeg"/>
			<description><![CDATA[<h3>Summary:</h3><p><br></p><p>Security is a core responsibility for Function-as-a-Service (FaaS) providers. The prevailing approach has each function execute in its own container to isolate concurrent executions of different functions. However, successive invocations of the same function commonly reuse the runtime state of a previous invocation in order to avoid container cold-start delays when invoking a function. Although efficient, this container reuse has security implications for functions that are invoked on behalf of differently privileged users or administrative domains: bugs in a function’s implementation, third-party library, or the language runtime may leak private data from one invocation of the function to subsequent invocations of the same function.</p><br><p>In this episode, Mohamed Alzayat tells us about Groundhog, which isolates sequential invocations of a function by efficiently reverting to a clean state, free from any private data, after each invocation. Tune in to learn more about how Groundhog works and how it improves security in FaaS!</p><p><br></p><h3>Links:</h3><ul><li><a href="https://zayat.github.io/" rel="noopener noreferrer" target="_blank">Mohamed's homepage</a></li><li><a href="https://dl.acm.org/doi/10.1145/3552326.3567503" rel="noopener noreferrer" target="_blank">Groundhog EuroSys'23 paper</a></li><li><a href="https://gitlab.mpi-sws.org/groundhog" rel="noopener noreferrer" target="_blank">Groundhog codebase</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary:</h3><p><br></p><p>Security is a core responsibility for Function-as-a-Service (FaaS) providers. The prevailing approach has each function execute in its own container to isolate concurrent executions of different functions. However, successive invocations of the same function commonly reuse the runtime state of a previous invocation in order to avoid container cold-start delays when invoking a function. Although efficient, this container reuse has security implications for functions that are invoked on behalf of differently privileged users or administrative domains: bugs in a function’s implementation, third-party library, or the language runtime may leak private data from one invocation of the function to subsequent invocations of the same function.</p><br><p>In this episode, Mohamed Alzayat tells us about Groundhog, which isolates sequential invocations of a function by efficiently reverting to a clean state, free from any private data, after each invocation. Tune in to learn more about how Groundhog works and how it improves security in FaaS!</p><p><br></p><h3>Links:</h3><ul><li><a href="https://zayat.github.io/" rel="noopener noreferrer" target="_blank">Mohamed's homepage</a></li><li><a href="https://dl.acm.org/doi/10.1145/3552326.3567503" rel="noopener noreferrer" target="_blank">Groundhog EuroSys'23 paper</a></li><li><a href="https://gitlab.mpi-sws.org/groundhog" rel="noopener noreferrer" target="_blank">Groundhog codebase</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Cuong Nguyen | Detock: High Performance Multi-region Transactions at Scale | #39</title>
			<itunes:title>Cuong Nguyen | Detock: High Performance Multi-region Transactions at Scale | #39</itunes:title>
			<pubDate>Mon, 28 Aug 2023 07:05:52 GMT</pubDate>
			<itunes:duration>37:28</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/64e537d8d3554a00111cdf48/media.mp3" length="35983488" type="audio/mpeg"/>
			<guid isPermaLink="false">64e537d8d3554a00111cdf48</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://dl.acm.org/doi/pdf/10.1145/3589293</link>
			<acast:episodeId>64e537d8d3554a00111cdf48</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>cuong-nguyen-detock-high-performance-multi-region-transactio</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4LnyWdchMK+ZFlgDJpItRdfJUAtAk1sYd8WyMu/KBMIb4v+NE6MGi4FdIsecc/pYNVKSQrFkfnUux0FCAm6Rzdq]]></acast:settings>
			<itunes:subtitle><![CDATA[SIGMOD'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>5</itunes:season>
			<itunes:episode>9</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1692743296669-3a675d5e964e61de4802db11a205c21c.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p>In this episode, Cuong Nguyen tells us about Detock, a geographically replicated database system. Tune in to learn about its specialised concurrency control and deadlock resolution protocols, which enable processing strictly-serializable multi-region transactions with near-zero performance degradation even under extremely high conflict, while improving latency by up to a factor of 5.</p><br><p><br></p><h3>Links: </h3><ul><li><a href="https://dl.acm.org/doi/pdf/10.1145/3589293" rel="noopener noreferrer" target="_blank">SIGMOD Paper</a></li><li><a href="https://github.com/umd-dslam/Detock" rel="noopener noreferrer" target="_blank">Detock Github Repo</a></li><li><a href="https://ctring.github.io/" rel="noopener noreferrer" target="_blank">Cuong's Homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p>In this episode, Cuong Nguyen tells us about Detock, a geographically replicated database system. Tune in to learn about its specialised concurrency control and deadlock resolution protocols, which enable processing strictly-serializable multi-region transactions with near-zero performance degradation under extremely high conflict while improving latency by up to a factor of 5.</p><br><p><br></p><h3>Links: </h3><ul><li><a href="https://dl.acm.org/doi/pdf/10.1145/3589293" rel="noopener noreferrer" target="_blank">SIGMOD Paper</a></li><li><a href="https://github.com/umd-dslam/Detock" rel="noopener noreferrer" target="_blank">Detock Github Repo</a></li><li><a href="https://ctring.github.io/" rel="noopener noreferrer" target="_blank">Cuong's Homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Bogdan Stoica | WAFFLE: Exposing Memory Ordering Bugs Efficiently with Active Delay Injection | #38</title>
			<itunes:title>Bogdan Stoica | WAFFLE: Exposing Memory Ordering Bugs Efficiently with Active Delay Injection | #38</itunes:title>
			<pubDate>Mon, 14 Aug 2023 07:04:20 GMT</pubDate>
			<itunes:duration>55:57</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/64d6831d2f7c3a0011986a9d/media.mp3" length="53727360" type="audio/mpeg"/>
			<guid isPermaLink="false">64d6831d2f7c3a0011986a9d</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.microsoft.com/en-us/research/uploads/prod/2022/12/EuroSys23_camera_ready__WAFFLE_Exposing_Memory_Ordering_Bugs_Efficiently_with_Active_Delay_Injection.pdf</link>
			<acast:episodeId>64d6831d2f7c3a0011986a9d</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>bogdan-stoica-waffle-exposing-memory-ordering-bugs-efficient</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4KHNBaxiMWac1kPHxZTi4LpwV3kdFp5jL7+DJZngyPOJ2CfJU3SYVBHgUnXMMgqfUoEOa/444hjVBvkNHJ6C25c]]></acast:settings>
			<itunes:subtitle><![CDATA[EuroSys'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>5</itunes:season>
			<itunes:episode>8</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1691779806422-3488dca5ecd9564fdc6d08a11bd9ff3e.jpeg"/>
			<description><![CDATA[<p>Concurrency bugs are difficult to detect, reproduce, and diagnose, as they manifest only under rare timing conditions. Recently, active delay injection has proven effective for exposing one such type of bug — thread-safety violations — with low overhead, high coverage, and minimal code analysis. However, how to efficiently apply active delay injection to broader classes of concurrency bugs is still an open question.</p><br><p>In this episode, Bogdan Stoica tells us how he answered this question by focusing on MemOrder bugs — a type of concurrency bug caused by incorrect timing between a memory access to a particular object and the object’s initialization or deallocation. Tune in to learn about Waffle — a delay injection tool that tailors key design points to better match the nature of MemOrder bugs. </p><br><p>Links: </p><ul><li><a href="https://www.microsoft.com/en-us/research/uploads/prod/2022/12/EuroSys23_camera_ready__WAFFLE_Exposing_Memory_Ordering_Bugs_Efficiently_with_Active_Delay_Injection.pdf" rel="noopener noreferrer" target="_blank">EuroSys'23 Paper</a></li><li><a href="https://bastoica.github.io/" rel="noopener noreferrer" target="_blank">Bogdan's Homepage</a></li><li><a href="https://github.com/bastoica/waffle" rel="noopener noreferrer" target="_blank">Waffle's GitHub Repo</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>Concurrency bugs are difficult to detect, reproduce, and diagnose, as they manifest only under rare timing conditions. Recently, active delay injection has proven effective for exposing one such type of bug — thread-safety violations — with low overhead, high coverage, and minimal code analysis. However, how to efficiently apply active delay injection to broader classes of concurrency bugs is still an open question.</p><br><p>In this episode, Bogdan Stoica tells us how he answered this question by focusing on MemOrder bugs — a type of concurrency bug caused by incorrect timing between a memory access to a particular object and the object’s initialization or deallocation. Tune in to learn about Waffle — a delay injection tool that tailors key design points to better match the nature of MemOrder bugs. </p><br><p>Links: </p><ul><li><a href="https://www.microsoft.com/en-us/research/uploads/prod/2022/12/EuroSys23_camera_ready__WAFFLE_Exposing_Memory_Ordering_Bugs_Efficiently_with_Active_Delay_Injection.pdf" rel="noopener noreferrer" target="_blank">EuroSys'23 Paper</a></li><li><a href="https://bastoica.github.io/" rel="noopener noreferrer" target="_blank">Bogdan's Homepage</a></li><li><a href="https://github.com/bastoica/waffle" rel="noopener noreferrer" target="_blank">Waffle's GitHub Repo</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Roger Waleffe | MariusGNN: Resource-Efficient Out-of-Core Training of Graph Neural Networks | #37</title>
			<itunes:title>Roger Waleffe | MariusGNN: Resource-Efficient Out-of-Core Training of Graph Neural Networks | #37</itunes:title>
			<pubDate>Mon, 31 Jul 2023 07:05:21 GMT</pubDate>
			<itunes:duration>1:13:06</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/64c635e8fc49e20011956bea/media.mp3" length="70183040" type="audio/mpeg"/>
			<guid isPermaLink="false">64c635e8fc49e20011956bea</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://arxiv.org/abs/2202.02365</link>
			<acast:episodeId>64c635e8fc49e20011956bea</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>roger-waleffe-mariusgnn-resource-efficient-out-of-core-train</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4IDyuyDGcpdpLAmWiB1HcmBvc3bi6feDSEvJrBSe6d2NlZAehv3RKWBwhPkO9gKl0HqAJ17mh2H+LP6cVWxgTIq]]></acast:settings>
			<itunes:subtitle><![CDATA[EuroSys'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>5</itunes:season>
			<itunes:episode>7</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1690710063978-668a16ca0eb5de71abf84df05f946861.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p>In this episode, Roger Waleffe talks about Graph Neural Networks (GNNs) for large-scale graphs. Specifically, he reveals all about MariusGNN, the first system that utilises the entire storage hierarchy (including disk) for GNN training. Tune in to find out how MariusGNN works and just how fast it goes (and how much more cost-efficient it is!) </p><br><p>Links: </p><ul><li><a href="https://marius-project.org/" rel="noopener noreferrer" target="_blank">Marius Project</a></li><li><a href="http://www.rogerwaleffe.com/" rel="noopener noreferrer" target="_blank">Roger's Homepage</a> </li><li><a href="https://twitter.com/RWaleffe" rel="noopener noreferrer" target="_blank">Roger's Twitter</a></li><li><a href="https://arxiv.org/abs/2202.02365" rel="noopener noreferrer" target="_blank">EuroSys'23 Paper</a></li></ul><p><br></p><p>Support the podcast through <a href="https://www.buymeacoffee.com/disseminate" rel="noopener noreferrer" target="_blank">Buy Me a Coffee</a></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p>In this episode, Roger Waleffe talks about Graph Neural Networks (GNNs) for large-scale graphs. Specifically, he reveals all about MariusGNN, the first system that utilises the entire storage hierarchy (including disk) for GNN training. Tune in to find out how MariusGNN works and just how fast it goes (and how much more cost-efficient it is!) </p><br><p>Links: </p><ul><li><a href="https://marius-project.org/" rel="noopener noreferrer" target="_blank">Marius Project</a></li><li><a href="http://www.rogerwaleffe.com/" rel="noopener noreferrer" target="_blank">Roger's Homepage</a> </li><li><a href="https://twitter.com/RWaleffe" rel="noopener noreferrer" target="_blank">Roger's Twitter</a></li><li><a href="https://arxiv.org/abs/2202.02365" rel="noopener noreferrer" target="_blank">EuroSys'23 Paper</a></li></ul><p><br></p><p>Support the podcast through <a href="https://www.buymeacoffee.com/disseminate" rel="noopener noreferrer" target="_blank">Buy Me a Coffee</a></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Madelon Hulsebos | GitTables: A Large-Scale Corpus of Relational Tables | #36</title>
			<itunes:title>Madelon Hulsebos | GitTables: A Large-Scale Corpus of Relational Tables | #36</itunes:title>
			<pubDate>Mon, 17 Jul 2023 07:04:34 GMT</pubDate>
			<itunes:duration>45:54</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/64b2dfba955b15001155f6f4/media.mp3" length="44079232" type="audio/mpeg"/>
			<guid isPermaLink="false">64b2dfba955b15001155f6f4</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://arxiv.org/pdf/2106.07258.pdf</link>
			<acast:episodeId>64b2dfba955b15001155f6f4</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>madelon-hulsebos-gittables</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4KYOjcttMU/WocBWZprfK1cT7J9KAAlCfa3ADsaZJqyI0cSIUFWOIxNe4pwnVtx1NlTn+A5GGj+X+A43pY98GR7]]></acast:settings>
			<itunes:subtitle><![CDATA[SIGMOD'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>5</itunes:season>
			<itunes:episode>6</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1689443702837-b5fe1c001d50237792f9f0c430be6b82.jpeg"/>
			<description><![CDATA[<h3>Summary:</h3><p>The success of deep learning has sparked interest in improving relational table tasks, like data preparation and search, with table representation models trained on large table corpora. Existing table corpora primarily contain tables extracted from HTML pages, limiting the capability to represent offline database tables. To train and evaluate high-capacity models for applications beyond the Web, we need resources with tables that resemble relational database tables. In this episode, Madelon Hulsebos tells us all about such a resource! Tune in to learn more about GitTables!! </p><p><br></p><h3>Links: </h3><ul><li><a href="https://www.madelonhulsebos.com/" rel="noopener noreferrer" target="_blank">Madelon's website</a></li><li><a href="https://gittables.github.io" rel="noopener noreferrer" target="_blank">GitTables homepage</a></li><li><a href="https://arxiv.org/pdf/2106.07258.pdf" rel="noopener noreferrer" target="_blank">SIGMOD'23 paper</a></li></ul><p><br></p><p><a href="https://www.buymeacoffee.com/disseminate" rel="noopener noreferrer" target="_blank">Buy Me A Coffee! </a></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary:</h3><p>The success of deep learning has sparked interest in improving relational table tasks, like data preparation and search, with table representation models trained on large table corpora. Existing table corpora primarily contain tables extracted from HTML pages, limiting the capability to represent offline database tables. To train and evaluate high-capacity models for applications beyond the Web, we need resources with tables that resemble relational database tables. In this episode, Madelon Hulsebos tells us all about such a resource! Tune in to learn more about GitTables!! </p><p><br></p><h3>Links: </h3><ul><li><a href="https://www.madelonhulsebos.com/" rel="noopener noreferrer" target="_blank">Madelon's website</a></li><li><a href="https://gittables.github.io" rel="noopener noreferrer" target="_blank">GitTables homepage</a></li><li><a href="https://arxiv.org/pdf/2106.07258.pdf" rel="noopener noreferrer" target="_blank">SIGMOD'23 paper</a></li></ul><p><br></p><p><a href="https://www.buymeacoffee.com/disseminate" rel="noopener noreferrer" target="_blank">Buy Me A Coffee! </a></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Tarikul Islam Papon | ACEing the Bufferpool Management Paradigm for Modern Storage Devices | #35</title>
			<itunes:title>Tarikul Islam Papon | ACEing the Bufferpool Management Paradigm for Modern Storage Devices | #35</itunes:title>
			<pubDate>Tue, 20 Jun 2023 15:31:59 GMT</pubDate>
			<itunes:duration>47:18</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/6491c670b4a5ea001183219e/media.mp3" length="45422720" type="audio/mpeg"/>
			<guid isPermaLink="false">6491c670b4a5ea001183219e</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://cs-people.bu.edu/papon/pdfs/icde23-papon.pdf</link>
			<acast:episodeId>6491c670b4a5ea001183219e</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>tarikul-islam-papon-aceing-the-bufferpool-management-paradig</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4KhLtgn7EMyeMZopPAB0/5KWwHQuEIyZn+dz+aWnqMksOW2ma/NCdjCPmSutGKwTczC8H0QPcwYpPCuTxGX+5R/]]></acast:settings>
			<itunes:subtitle><![CDATA[ICDE'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>5</itunes:season>
			<itunes:episode>5</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1658226600118-c8a7fa0e10288202fba2d9721149a154.jpeg"/>
			<description><![CDATA[<h3>Summary:</h3><p>Compared to hard disk drives (HDDs), solid-state drives (SSDs) have two fundamentally different properties: (i) read/write asymmetry (writes are slower than reads) and (ii) access concurrency (multiple I/Os can be executed in parallel to saturate the device bandwidth). However, database operators are often designed without considering storage asymmetry and concurrency, resulting in device underutilization. In this episode, Tarikul Islam Papon tells us about his work on a new Asymmetry &amp; Concurrency aware bufferpool management (ACE) that batches writes based on device concurrency and performs them in parallel to amortize the asymmetric write cost. Tune in to learn more! </p><p><br></p><h3>Links:</h3><ul><li><a href="https://cs-people.bu.edu/papon/pdfs/icde23-papon.pdf" rel="noopener noreferrer" target="_blank">ICDE'23 Paper</a></li><li><a href="https://cs-people.bu.edu/papon/" rel="noopener noreferrer" target="_blank">Papon's Homepage</a></li><li><a href="https://cs-people.bu.edu/papon/" rel="noopener noreferrer" target="_blank">Papon's LinkedIn</a></li></ul><p><br></p><p><a href="https://www.buymeacoffee.com/disseminate" rel="noopener noreferrer" target="_blank">Buy me a coffee</a>  </p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary:</h3><p>Compared to hard disk drives (HDDs), solid-state drives (SSDs) have two fundamentally different properties: (i) read/write asymmetry (writes are slower than reads) and (ii) access concurrency (multiple I/Os can be executed in parallel to saturate the device bandwidth). However, database operators are often designed without considering storage asymmetry and concurrency, resulting in device underutilization. In this episode, Tarikul Islam Papon tells us about his work on a new Asymmetry &amp; Concurrency aware bufferpool management (ACE) that batches writes based on device concurrency and performs them in parallel to amortize the asymmetric write cost. Tune in to learn more! </p><p><br></p><h3>Links:</h3><ul><li><a href="https://cs-people.bu.edu/papon/pdfs/icde23-papon.pdf" rel="noopener noreferrer" target="_blank">ICDE'23 Paper</a></li><li><a href="https://cs-people.bu.edu/papon/" rel="noopener noreferrer" target="_blank">Papon's Homepage</a></li><li><a href="https://cs-people.bu.edu/papon/" rel="noopener noreferrer" target="_blank">Papon's LinkedIn</a></li></ul><p><br></p><p><a href="https://www.buymeacoffee.com/disseminate" rel="noopener noreferrer" target="_blank">Buy me a coffee</a>  </p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Jian Zhang | VIPER: A Fast Snapshot Isolation Checker | #34</title>
			<itunes:title>Jian Zhang | VIPER: A Fast Snapshot Isolation Checker | #34</itunes:title>
			<pubDate>Fri, 09 Jun 2023 07:03:50 GMT</pubDate>
			<itunes:duration>42:34</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/6481ecb4dc05b9001166eeea/media.mp3" length="40870357" type="audio/mpeg"/>
			<guid isPermaLink="false">6481ecb4dc05b9001166eeea</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://dl.acm.org/doi/pdf/10.1145/3552326.3567492</link>
			<acast:episodeId>6481ecb4dc05b9001166eeea</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>jian-zhang-viper-a-fast-snapshot-isolation-checker-34</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4IkRIh6EYo44UiJ516EvxFa/IJhjDPE2yWriIHzD762bTN7CO43wOqrorCHYLJd/yo0h/IWAz7KOrwEgi+ve/dV]]></acast:settings>
			<itunes:subtitle><![CDATA[EuroSys'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>5</itunes:season>
			<itunes:episode>4</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1658226600118-c8a7fa0e10288202fba2d9721149a154.jpeg"/>
			<description><![CDATA[<h3>Summary:</h3><p>Snapshot isolation is supported by most commercial databases and is widely used by applications. However, checking whether a database ensures snapshot isolation for a given set of transactions is either slow or gives up soundness. In this episode, Jian Zhang tells us about VIPER, an SI checker that is sound, complete, and fast. Tune in to learn more!! </p><p><br></p><h3>Links:</h3><ul><li><a href="https://dl.acm.org/doi/pdf/10.1145/3552326.3567492" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://github.com/Khoury-srg/Viper" rel="noopener noreferrer" target="_blank">GitHub repo</a></li><li><a href="https://www.khoury.northeastern.edu/people/jian-zhang/" rel="noopener noreferrer" target="_blank">Jian's homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary:</h3><p>Snapshot isolation is supported by most commercial databases and is widely used by applications. However, checking whether a database ensures snapshot isolation for a given set of transactions is either slow or gives up soundness. In this episode, Jian Zhang tells us about VIPER, an SI checker that is sound, complete, and fast. Tune in to learn more!! </p><p><br></p><h3>Links:</h3><ul><li><a href="https://dl.acm.org/doi/pdf/10.1145/3552326.3567492" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://github.com/Khoury-srg/Viper" rel="noopener noreferrer" target="_blank">GitHub repo</a></li><li><a href="https://www.khoury.northeastern.edu/people/jian-zhang/" rel="noopener noreferrer" target="_blank">Jian's homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Ahmed Sayed | REFL: Resource Efficient Federated Learning | #33</title>
			<itunes:title>Ahmed Sayed | REFL: Resource Efficient Federated Learning | #33</itunes:title>
			<pubDate>Fri, 26 May 2023 07:03:35 GMT</pubDate>
			<itunes:duration>58:53</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/646fdff476120a001176716c/media.mp3" length="56537216" type="audio/mpeg"/>
			<guid isPermaLink="false">646fdff476120a001176716c</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://dl.acm.org/doi/pdf/10.1145/3552326.3567485</link>
			<acast:episodeId>646fdff476120a001176716c</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>ahmed-sayed-refl-resource-efficient-federated-learning-33</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4K9+HNJt5GcGAbOiyVOze97whRsDCN/cY8gIuTJD5wVrsXzffZV8VMwF8NN/yS53QeBBf+AgOCvFlSXHNiLes5Q]]></acast:settings>
			<itunes:subtitle><![CDATA[EuroSys'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>5</itunes:season>
			<itunes:episode>3</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1658226600118-c8a7fa0e10288202fba2d9721149a154.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p><br></p><p>Federated Learning (FL) enables distributed training by learners using local data, thereby enhancing privacy and reducing communication. However, it presents numerous challenges relating to the heterogeneity of the data distribution, device capabilities, and participant availability as deployments scale, which can impact both model convergence and bias. Existing FL schemes use random participant selection to improve fairness; however, this can result in inefficient use of resources and lower-quality training. In this episode, Ahmed Sayed talks about how he and his colleagues address the question of resource efficiency in FL. He talks about the benefits of intelligent participant selection and the incorporation of updates from straggling participants. Tune in to learn more!</p><h3><br></h3><h3>Links:</h3><ul><li><a href="https://dl.acm.org/doi/pdf/10.1145/3552326.3567485" rel="noopener noreferrer" target="_blank">EuroSys'23 Paper</a></li><li><a href="https://www.linkedin.com/in/ahmedmabdelmoniem/" rel="noopener noreferrer" target="_blank">Ahmed's LinkedIn </a></li><li><a href="http://eecs.qmul.ac.uk/~ahmed/" rel="noopener noreferrer" target="_blank">Ahmed's Homepage</a></li><li><a href="https://twitter.com/ahmedcs982" rel="noopener noreferrer" target="_blank">Ahmed's Twitter</a></li><li><a href="https://github.com/ahmedcs/REFL" rel="noopener noreferrer" target="_blank">REFL Github</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p><br></p><p>Federated Learning (FL) enables distributed training by learners using local data, thereby enhancing privacy and reducing communication. However, it presents numerous challenges relating to the heterogeneity of the data distribution, device capabilities, and participant availability as deployments scale, which can impact both model convergence and bias. Existing FL schemes use random participant selection to improve fairness; however, this can result in inefficient use of resources and lower-quality training. In this episode, Ahmed Sayed talks about how he and his colleagues address the question of resource efficiency in FL. He talks about the benefits of intelligent participant selection and the incorporation of updates from straggling participants. Tune in to learn more!</p><h3><br></h3><h3>Links:</h3><ul><li><a href="https://dl.acm.org/doi/pdf/10.1145/3552326.3567485" rel="noopener noreferrer" target="_blank">EuroSys'23 Paper</a></li><li><a href="https://www.linkedin.com/in/ahmedmabdelmoniem/" rel="noopener noreferrer" target="_blank">Ahmed's LinkedIn </a></li><li><a href="http://eecs.qmul.ac.uk/~ahmed/" rel="noopener noreferrer" target="_blank">Ahmed's Homepage</a></li><li><a href="https://twitter.com/ahmedcs982" rel="noopener noreferrer" target="_blank">Ahmed's Twitter</a></li><li><a href="https://github.com/ahmedcs/REFL" rel="noopener noreferrer" target="_blank">REFL Github</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Subhadeep Sarkar | Log-structured Merge Trees | #32</title>
			<itunes:title>Subhadeep Sarkar | Log-structured Merge Trees | #32</itunes:title>
			<pubDate>Thu, 11 May 2023 07:03:40 GMT</pubDate>
			<itunes:duration>59:27</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/645c14e82d07d30011bd4f38/media.mp3" length="57077888" type="audio/mpeg"/>
			<guid isPermaLink="false">645c14e82d07d30011bd4f38</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://subhadeep.net/assets/fulltext/The_LSM_Design_Space_and_its_Read_Optimizations.pdf</link>
			<acast:episodeId>645c14e82d07d30011bd4f38</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>subhadeep-sarkar-log-structured-merge-trees-32</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4K73cNfE5jf/VQb4DPW5hs49ZaG1mEZJbLqcpAwk8QS8mUTnI5KE9F7hcCe7lYUJce8vMg/QYj31k1UNJ1WhoAk]]></acast:settings>
			<itunes:subtitle><![CDATA[ICDE'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>5</itunes:season>
			<itunes:episode>2</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1658226600118-c8a7fa0e10288202fba2d9721149a154.jpeg"/>
			<description><![CDATA[<h3>Summary:</h3><p>Log-structured merge (LSM) trees have emerged as one of the most commonly used storage-based data structures in modern data systems as they offer high throughput for writes and good utilization of storage space. In this episode, Subhadeep Sarkar presents the fundamental principles of the LSM paradigm. He tells us about recent research on improving write performance and the various optimization techniques and hybrid designs adopted by LSM engines to accelerate reads. Tune in to find out more! </p><p><br></p><h3>Links:</h3><ul><li><a href="https://subhadeep.net/" rel="noopener noreferrer" target="_blank">Personal website</a></li><li><a href="https://subhadeep.net/assets/fulltext/The_LSM_Design_Space_and_its_Read_Optimizations.pdf" rel="noopener noreferrer" target="_blank">ICDE'23 tutorial</a></li><li><a href="https://www.linkedin.com/in/sarkarsubhadeep/" rel="noopener noreferrer" target="_blank">LinkedIn</a> </li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary:</h3><p>Log-structured merge (LSM) trees have emerged as one of the most commonly used storage-based data structures in modern data systems as they offer high throughput for writes and good utilization of storage space. In this episode, Subhadeep Sarkar presents the fundamental principles of the LSM paradigm. He tells us about recent research on improving write performance and the various optimization techniques and hybrid designs adopted by LSM engines to accelerate reads. Tune in to find out more! </p><p><br></p><h3>Links:</h3><ul><li><a href="https://subhadeep.net/" rel="noopener noreferrer" target="_blank">Personal website</a></li><li><a href="https://subhadeep.net/assets/fulltext/The_LSM_Design_Space_and_its_Read_Optimizations.pdf" rel="noopener noreferrer" target="_blank">ICDE'23 tutorial</a></li><li><a href="https://www.linkedin.com/in/sarkarsubhadeep/" rel="noopener noreferrer" target="_blank">LinkedIn</a> </li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Andra Ionescu | Topio: The Geodata Marketplace | #31</title>
			<itunes:title>Andra Ionescu | Topio: The Geodata Marketplace | #31</itunes:title>
			<pubDate>Tue, 25 Apr 2023 07:00:55 GMT</pubDate>
			<itunes:duration>46:25</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/64470096d18c13001140840f/media.mp3" length="44562560" type="audio/mpeg"/>
			<guid isPermaLink="false">64470096d18c13001140840f</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/andra-ionescu-topio-the-geodata-marketplace-3</link>
			<acast:episodeId>64470096d18c13001140840f</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>andra-ionescu-topio-the-geodata-marketplace-3</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4LsfJtaNvSB9JiE4fiTN8j+BV/9jKolY+m68mCghcdNKBvuNINBWM67EKLpUNPL8wyZ+Nkft2C+L7iCwgjCBCg0]]></acast:settings>
			<itunes:subtitle><![CDATA[ICWE'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>5</itunes:season>
			<itunes:episode>1</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1658226600118-c8a7fa0e10288202fba2d9721149a154.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p>The increasing need for data trading across businesses has created a demand for data marketplaces. However, despite the intentions of both data providers and consumers, today’s data marketplaces remain mere data catalogs. In this episode, Andra tells us about her vision for marketplaces of the future, which require a set of value-added services such as advanced search and discovery. She also tells us about her team's effort to engineer and develop an open-source, modular data market platform that enables both entrepreneurs and researchers to set up and experiment with data marketplaces. Tune in to learn more about Topio, a real-world web platform for trading geospatial data that is currently in beta.</p><h3><br></h3><h3>Links: </h3><ul><li><a href="https://beta.topio.market/about" rel="noopener noreferrer" target="_blank">Topio Marketplace</a></li><li><a href="https://andraionescu.github.io/" rel="noopener noreferrer" target="_blank">Andra's Homepage</a></li><li><a href="https://twitter.com/andradenisio" rel="noopener noreferrer" target="_blank">Andra's Twitter</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p>The increasing need for data trading across businesses has created a demand for data marketplaces. However, despite the intentions of both data providers and consumers, today’s data marketplaces remain mere data catalogs. In this episode, Andra tells us about her vision for marketplaces of the future, which require a set of value-added services such as advanced search and discovery. She also tells us about her team's effort to engineer and develop an open-source, modular data market platform that enables both entrepreneurs and researchers to set up and experiment with data marketplaces. Tune in to learn more about Topio, a real-world web platform for trading geospatial data that is currently in beta.</p><h3><br></h3><h3>Links: </h3><ul><li><a href="https://beta.topio.market/about" rel="noopener noreferrer" target="_blank">Topio Marketplace</a></li><li><a href="https://andraionescu.github.io/" rel="noopener noreferrer" target="_blank">Andra's Homepage</a></li><li><a href="https://twitter.com/andradenisio" rel="noopener noreferrer" target="_blank">Andra's Twitter</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Laurens Kuiper | These Rows Are Made For Sorting | #30</title>
			<itunes:title>Laurens Kuiper | These Rows Are Made For Sorting | #30</itunes:title>
			<pubDate>Wed, 12 Apr 2023 07:01:25 GMT</pubDate>
			<itunes:duration>55:01</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/64358ed299962c0011c43261/media.mp3" length="52821900" type="audio/mpeg"/>
			<guid isPermaLink="false">64358ed299962c0011c43261</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/laurens-kuiper-these-rows-are-made-for-sorting-30</link>
			<acast:episodeId>64358ed299962c0011c43261</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>laurens-kuiper-these-rows-are-made-for-sorting-30</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4JXb0R2p4k5fd5c+Mh8I63xyrf8wlDFs8FssA/mV0pwrsQmgUoKMtZVTgaeAndaoArrj6zNC5k2f38PhVicCrg+]]></acast:settings>
			<itunes:subtitle><![CDATA[ICDE'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>4</itunes:season>
			<itunes:episode>10</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1658226600118-c8a7fa0e10288202fba2d9721149a154.jpeg"/>
			<description><![CDATA[<h2>Summary: </h2><p>Sorting is one of the most well-studied problems in computer science and a vital operation for relational database systems. Despite this, little research has been published on implementing an efficient relational sorting operator. In this episode, Laurens Kuiper tells us about his work filling this gap! Tune in to hear about a micro-benchmark that explores how to sort relational data efficiently for analytical database systems, taking into account different query execution engines as well as row and columnar data formats. Laurens also tells us about his implementation of a highly optimized row-based sorting approach in DuckDB, the open-source in-process analytical database management system. Check out the episode to learn more!</p><h2><br></h2><h2>Links:</h2><ul><li><a href="https://hannes.muehleisen.org/publications/ICDE2023-sorting.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://duckdb.org/" rel="noopener noreferrer" target="_blank">DuckDB</a></li><li><a href="https://www.linkedin.com/in/lnkuiper/" rel="noopener noreferrer" target="_blank">Laurens's LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h2>Summary: </h2><p>Sorting is one of the most well-studied problems in computer science and a vital operation for relational database systems. Despite this, little research has been published on implementing an efficient relational sorting operator. In this episode, Laurens Kuiper tells us about his work filling this gap! Tune in to hear about a micro-benchmark that explores how to sort relational data efficiently for analytical database systems, taking into account different query execution engines as well as row and columnar data formats. Laurens also tells us about his implementation of a highly optimized row-based sorting approach in DuckDB, the open-source in-process analytical database management system. Check out the episode to learn more!</p><h2><br></h2><h2>Links:</h2><ul><li><a href="https://hannes.muehleisen.org/publications/ICDE2023-sorting.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://duckdb.org/" rel="noopener noreferrer" target="_blank">DuckDB</a></li><li><a href="https://www.linkedin.com/in/lnkuiper/" rel="noopener noreferrer" target="_blank">Laurens's LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Semih Salihoğlu | Kùzu Graph Database Management System | #29</title>
			<itunes:title>Semih Salihoğlu | Kùzu Graph Database Management System | #29</itunes:title>
			<pubDate>Mon, 03 Apr 2023 07:03:00 GMT</pubDate>
			<itunes:duration>1:17:06</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/642886b09bb1730011680437/media.mp3" length="74020992" type="audio/mpeg"/>
			<guid isPermaLink="false">642886b09bb1730011680437</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://cs.uwaterloo.ca/~ssalihog/papers/kuzu-tr.pdf</link>
			<acast:episodeId>642886b09bb1730011680437</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>semih-saliholu-kuzu-graph-database-management-system-29</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4Lm9LOtd3p952tOvGWwDNy8yrXbWgNzXWuSK/1rMHaT37oo6+5lbbRHbS472biEBzrW9SCIUPGVIVxg1W2LPTcC]]></acast:settings>
			<itunes:subtitle><![CDATA[CIDR'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>4</itunes:season>
			<itunes:episode>9</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1658226600118-c8a7fa0e10288202fba2d9721149a154.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p>In this episode Semih Salihoğlu tells us about Kùzu, an in-process property graph database management system built for query speed and scalability.</p><p>Listen to hear the vision for Kùzu and to learn more about Kùzu's factorized query processor! </p><p><br></p><h3>Links:</h3><ul><li><a href="https://github.com/kuzudb/kuzu" rel="noopener noreferrer" target="_blank">Kùzu GitHub repo</a></li><li><a href="https://cs.uwaterloo.ca/~ssalihog/papers/kuzu-tr.pdf" rel="noopener noreferrer" target="_blank">CIDR paper</a></li><li><a href="mailto:contact@kuzudb.com" rel="noopener noreferrer" target="_blank">contact@kuzudb.com</a> </li><li><a href="https://join.slack.com/t/kuzudb/shared_invite/zt-1n67h736q-E3AFGSI4w~ljlFMYr3_Sjg" rel="noopener noreferrer" target="_blank">Kùzu Slack</a></li><li><a href="https://twitter.com/kuzudb" rel="noopener noreferrer" target="_blank">Kùzu Twitter</a></li><li><a href="https://kuzudb.com/" rel="noopener noreferrer" target="_blank">Kùzu Website</a> - blog posts Semih mentioned can be found here</li><li><a href="https://cs.uwaterloo.ca/~ssalihog/" rel="noopener noreferrer" target="_blank">Semih's Homepage</a></li><li><a href="https://twitter.com/semihsalihoglu" rel="noopener noreferrer" target="_blank">Semih's Twitter</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p>In this episode Semih Salihoğlu tells us about Kùzu, an in-process property graph database management system built for query speed and scalability.</p><p>Listen to hear the vision for Kùzu and to learn more about Kùzu's factorized query processor! </p><p><br></p><h3>Links:</h3><ul><li><a href="https://github.com/kuzudb/kuzu" rel="noopener noreferrer" target="_blank">Kùzu GitHub repo</a></li><li><a href="https://cs.uwaterloo.ca/~ssalihog/papers/kuzu-tr.pdf" rel="noopener noreferrer" target="_blank">CIDR paper</a></li><li><a href="mailto:contact@kuzudb.com" rel="noopener noreferrer" target="_blank">contact@kuzudb.com</a> </li><li><a href="https://join.slack.com/t/kuzudb/shared_invite/zt-1n67h736q-E3AFGSI4w~ljlFMYr3_Sjg" rel="noopener noreferrer" target="_blank">Kùzu Slack</a></li><li><a href="https://twitter.com/kuzudb" rel="noopener noreferrer" target="_blank">Kùzu Twitter</a></li><li><a href="https://kuzudb.com/" rel="noopener noreferrer" target="_blank">Kùzu Website</a> - blog posts Semih mentioned can be found here</li><li><a href="https://cs.uwaterloo.ca/~ssalihog/" rel="noopener noreferrer" target="_blank">Semih's Homepage</a></li><li><a href="https://twitter.com/semihsalihoglu" rel="noopener noreferrer" target="_blank">Semih's Twitter</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Lukas Vogel | Data Pipes: Declarative Control over Data Movement | #28</title>
			<itunes:title>Lukas Vogel | Data Pipes: Declarative Control over Data Movement | #28</itunes:title>
			<pubDate>Tue, 28 Mar 2023 07:02:37 GMT</pubDate>
			<itunes:duration>50:27</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/64221ef4230d7e00110e6eb7/media.mp3" length="48435837" type="audio/mpeg"/>
			<guid isPermaLink="false">64221ef4230d7e00110e6eb7</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.cidrdb.org/cidr2023/papers/p55-vogel.pdf</link>
			<acast:episodeId>64221ef4230d7e00110e6eb7</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>lukas-vogel-data-pipes-declarative-control-over-data-movemen</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4J7Dli21HVuhe3vUesFrHIC1WV7uZxQgEtDXBq8GMGT+W6Ma88BROIbRWypEgv04YK1F/VzEd1TZoc46LEtt902]]></acast:settings>
			<itunes:subtitle><![CDATA[CIDR'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>4</itunes:season>
			<itunes:episode>8</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1658226600118-c8a7fa0e10288202fba2d9721149a154.jpeg"/>
			<description><![CDATA[<h3>Summary:</h3><p>Today’s storage landscape offers a deep and heterogeneous stack of technologies that promises to meet even the most demanding data-intensive workload needs. The diversity of technologies, however, presents a challenge. Parts of the stack are not controlled directly by the application, e.g., the cache layers, and the parts that are controlled often require the programmer to deal with very different transfer mechanisms, such as disk and network APIs. Combining these different abstractions properly requires great skill, and even so, expert-written programs can lead to sub-optimal utilization of the storage stack and unpredictable performance. In this episode, Lukas Vogel tells us how we can combat these issues with a new programming abstraction called Data Pipes. Tune in to learn more! </p><p><br></p><h3>Links:</h3><ul><li><a href="https://www.cidrdb.org/cidr2023/papers/p55-vogel.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://db.in.tum.de/~vogel/" rel="noopener noreferrer" target="_blank">Homepage</a></li><li><a href="https://twitter.com/VogelLu" rel="noopener noreferrer" target="_blank">Twitter</a></li><li><a href="https://www.linkedin.com/in/lukas-vogel-muc/?originalSubdomain=de" rel="noopener noreferrer" target="_blank">LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary:</h3><p>Today’s storage landscape offers a deep and heterogeneous stack of technologies that promises to meet even the most demanding data-intensive workload needs. The diversity of technologies, however, presents a challenge. Parts of the stack are not controlled directly by the application, e.g., the cache layers, and the parts that are controlled often require the programmer to deal with very different transfer mechanisms, such as disk and network APIs. Combining these different abstractions properly requires great skill, and even so, expert-written programs can lead to sub-optimal utilization of the storage stack and unpredictable performance. In this episode, Lukas Vogel tells us how we can combat these issues with a new programming abstraction called Data Pipes. Tune in to learn more! </p><p><br></p><h3>Links:</h3><ul><li><a href="https://www.cidrdb.org/cidr2023/papers/p55-vogel.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://db.in.tum.de/~vogel/" rel="noopener noreferrer" target="_blank">Homepage</a></li><li><a href="https://twitter.com/VogelLu" rel="noopener noreferrer" target="_blank">Twitter</a></li><li><a href="https://www.linkedin.com/in/lukas-vogel-muc/?originalSubdomain=de" rel="noopener noreferrer" target="_blank">LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Haralampos Gavriilidis | In-Situ Cross-Database Query Processing | #27</title>
			<itunes:title>Haralampos Gavriilidis | In-Situ Cross-Database Query Processing | #27</itunes:title>
			<pubDate>Mon, 20 Mar 2023 21:33:11 GMT</pubDate>
			<itunes:duration>1:00:53</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/6418d036f2b9c500113739d5/media.mp3" length="58452096" type="audio/mpeg"/>
			<guid isPermaLink="false">6418d036f2b9c500113739d5</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.user.tu-berlin.de/harry_g/assets/cross-database-query-processing-preprint.pdf</link>
			<acast:episodeId>6418d036f2b9c500113739d5</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>haralampos-gavriilidis-in-situ-cross-database-query-processi</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4IxvSvqkJWqbry2d41AV+UeKkKHNdF+YI8BAw4pQkpOoEuV7sGT3dvfoOGCndNGrzt/v164gjGxuMrHLIZf+Z2e]]></acast:settings>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>4</itunes:season>
			<itunes:episode>7</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1658226600118-c8a7fa0e10288202fba2d9721149a154.jpeg"/>
			<description><![CDATA[<h3>Summary:</h3><p>Today’s organizations utilize a plethora of heterogeneous and autonomous DBMSes, many of them spread across different geo-locations. It is therefore crucial to have effective and efficient cross-database query processing capabilities. In this episode, Haralampos Gavriilidis tells us about XDB, an efficient middleware system that runs cross-database analytics over existing DBMSes. Tune in to learn more!</p><p><br></p><h3>Links:</h3><ul><li><a href="https://www.user.tu-berlin.de/harry_g/assets/cross-database-query-processing-preprint.pdf" rel="noopener noreferrer" target="_blank">Preprint</a></li><li><a href="https://www.user.tu-berlin.de/harry_g/" rel="noopener noreferrer" target="_blank">Haralampos's homepage</a></li></ul><p><br></p><p>Support the podcast <a href="https://www.buymeacoffee.com/disseminate" rel="noopener noreferrer" target="_blank">here</a>!</p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary:</h3><p>Today’s organizations utilize a plethora of heterogeneous and autonomous DBMSes, many of them spread across different geo-locations. It is therefore crucial to have effective and efficient cross-database query processing capabilities. In this episode, Haralampos Gavriilidis tells us about XDB, an efficient middleware system that runs cross-database analytics over existing DBMSes. Tune in to learn more!</p><p><br></p><h3>Links:</h3><ul><li><a href="https://www.user.tu-berlin.de/harry_g/assets/cross-database-query-processing-preprint.pdf" rel="noopener noreferrer" target="_blank">Preprint</a></li><li><a href="https://www.user.tu-berlin.de/harry_g/" rel="noopener noreferrer" target="_blank">Haralampos's homepage</a></li></ul><p><br></p><p>Support the podcast <a href="https://www.buymeacoffee.com/disseminate" rel="noopener noreferrer" target="_blank">here</a>!</p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title><![CDATA[Paras Jain & Sarah Wooders | Skyplane: Fast Data Transfers Between Any Cloud | #26]]></title>
			<itunes:title><![CDATA[Paras Jain & Sarah Wooders | Skyplane: Fast Data Transfers Between Any Cloud | #26]]></itunes:title>
			<pubDate>Mon, 13 Mar 2023 08:04:40 GMT</pubDate>
			<itunes:duration>46:21</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/640ce0ba54085200111b04c0/media.mp3" length="44507264" type="audio/mpeg"/>
			<guid isPermaLink="false">640ce0ba54085200111b04c0</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://skyplane.org/en/latest/</link>
			<acast:episodeId>640ce0ba54085200111b04c0</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>paras-jain-sarah-wooders-skyplane-26</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4IDP4qCE5w7jW5/FxZzQTLM3iPwzP1u2bNF59CzjibDRubv/JUF9m5V2/Ir9Imv0mpmPgUCM+ACR+1tHhCRUnpc]]></acast:settings>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>4</itunes:season>
			<itunes:episode>6</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1658226600118-c8a7fa0e10288202fba2d9721149a154.jpeg"/>
			<description><![CDATA[<h3>Summary:</h3><p>This week Paras Jain and Sarah Wooders tell us how you can quickly transfer data between any cloud&nbsp;with Skyplane. Tune in to learn more! </p><p><br></p><h3>Links:</h3><ul><li><a href="https://skyplane.org/en/latest/" rel="noopener noreferrer" target="_blank">Skyplane homepage</a></li><li><a href="http://sarahwooders.com/" rel="noopener noreferrer" target="_blank">Sarah's homepage</a></li><li><a href="https://www.parasjain.com/" rel="noopener noreferrer" target="_blank">Paras's homepage</a></li></ul><p><br></p><p><strong>Support the podcast </strong><a href="https://www.buymeacoffee.com/disseminate" rel="noopener noreferrer" target="_blank"><strong>here</strong></a></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary:</h3><p>This week Paras Jain and Sarah Wooders tell us how you can quickly transfer data between any cloud&nbsp;with Skyplane. Tune in to learn more! </p><p><br></p><h3>Links:</h3><ul><li><a href="https://skyplane.org/en/latest/" rel="noopener noreferrer" target="_blank">Skyplane homepage</a></li><li><a href="http://sarahwooders.com/" rel="noopener noreferrer" target="_blank">Sarah's homepage</a></li><li><a href="https://www.parasjain.com/" rel="noopener noreferrer" target="_blank">Paras's homepage</a></li></ul><p><br></p><p><strong>Support the podcast </strong><a href="https://www.buymeacoffee.com/disseminate" rel="noopener noreferrer" target="_blank"><strong>here</strong></a></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Yang Wang | Rethinking Concurrency Control in Databases | #25</title>
			<itunes:title>Yang Wang | Rethinking Concurrency Control in Databases | #25</itunes:title>
			<pubDate>Mon, 06 Mar 2023 08:03:08 GMT</pubDate>
			<itunes:duration>55:56</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/6400e40fb71a580011f006c0/media.mp3" length="53713024" type="audio/mpeg"/>
			<guid isPermaLink="false">6400e40fb71a580011f006c0</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.cidrdb.org/cidr2023/papers/p30-cheng.pdf</link>
			<acast:episodeId>6400e40fb71a580011f006c0</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>yang-wang-rethinking-concurrency-control-in-databases-25</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4IpbjPkxGZ8g4zCga5+D6wukuHiDle/ko5RMeZxoH3XO+NhsahytmplDeHXNBsmqoOH8M6GTejFYTPg0QUyrLNc]]></acast:settings>
			<itunes:subtitle><![CDATA[CIDR'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>4</itunes:season>
			<itunes:episode>5</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1677779954736-f7405e0c6a1e4a2e7a5a7849a83e86c0.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p>Many database applications execute transactions under a weaker isolation level, such as READ COMMITTED. This often leads to concurrency bugs that look like race conditions in multi-threaded programs. While this problem is well known, philosophies of how to address it vary widely, ranging from making a SERIALIZABLE database faster to living with weaker isolation and the consequent concurrency bugs. In this episode, Yang talks about the consequences of these bugs, their root causes, and how developers have fixed 93 real-world concurrency bugs in database applications. Whose responsibility is it to prevent these bugs: the database's or the developer's? Listen to find out more!</p><p><br></p><h3>Links:</h3><ul><li><a href="https://www.cidrdb.org/cidr2023/papers/p30-cheng.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://web.cse.ohio-state.edu/~wang.7564/" rel="noopener noreferrer" target="_blank">Homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p>Many database applications execute transactions under a weaker isolation level, such as READ COMMITTED. This often leads to concurrency bugs that look like race conditions in multi-threaded programs. While this problem is well known, philosophies of how to address it vary widely, ranging from making a SERIALIZABLE database faster to living with weaker isolation and the consequent concurrency bugs. In this episode, Yang talks about the consequences of these bugs, their root causes, and how developers have fixed 93 real-world concurrency bugs in database applications. Whose responsibility is it to prevent these bugs: the database's or the developer's? Listen to find out more!</p><p><br></p><h3>Links:</h3><ul><li><a href="https://www.cidrdb.org/cidr2023/papers/p30-cheng.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://web.cse.ohio-state.edu/~wang.7564/" rel="noopener noreferrer" target="_blank">Homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Suyash Gupta | Chemistry behind Agreement | #24</title>
			<itunes:title>Suyash Gupta | Chemistry behind Agreement | #24</itunes:title>
			<pubDate>Mon, 27 Feb 2023 08:02:35 GMT</pubDate>
			<itunes:duration>1:03:51</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/63fb4d40188c36001149f432/media.mp3" length="61300864" type="audio/mpeg"/>
			<guid isPermaLink="false">63fb4d40188c36001149f432</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.cidrdb.org/cidr2023/papers/p85-gupta.pdf</link>
			<acast:episodeId>63fb4d40188c36001149f432</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>suyash-gupta-chemistry-behind-agreement-24</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4Kz9yACOy125O08wMRvuRU1UZ1sCu5W951wpTGUm3BeC80yH254YqfJnxZG+7KwVgyYKQ39+QEdTRD/mGyWb8/V]]></acast:settings>
			<itunes:subtitle><![CDATA[CIDR'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>4</itunes:season>
			<itunes:episode>4</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1677413361468-e65410165a02e4d33d7df751e20b53c0.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p>Agreement protocols have been extensively used by distributed data management systems to provide robustness and high availability. The broad spectrum of design dimensions, applications, and fault models has resulted in different flavours of agreement protocols. This has made it hard to argue their correctness and has unintentionally created a disparity in understanding their design. In this episode, Suyash Gupta tells us about a unified framework that simplifies expressing different agreement protocols. Listen to find out more! </p><p><br></p><h3>Links: </h3><ul><li><a href="https://www.cidrdb.org/cidr2023/papers/p85-gupta.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://gupta-suyash.github.io/" rel="noopener noreferrer" target="_blank">Website</a></li><li><a href="https://twitter.com/suyash_sg" rel="noopener noreferrer" target="_blank">Twitter</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p>Agreement protocols have been extensively used by distributed data management systems to provide robustness and high availability. The broad spectrum of design dimensions, applications, and fault models has resulted in different flavours of agreement protocols. This has made it hard to argue their correctness and has unintentionally created a disparity in understanding their design. In this episode, Suyash Gupta tells us about a unified framework that simplifies expressing different agreement protocols. Listen to find out more! </p><p><br></p><h3>Links: </h3><ul><li><a href="https://www.cidrdb.org/cidr2023/papers/p85-gupta.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://gupta-suyash.github.io/" rel="noopener noreferrer" target="_blank">Website</a></li><li><a href="https://twitter.com/suyash_sg" rel="noopener noreferrer" target="_blank">Twitter</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Tobias Ziegler | Is Scalable OLTP in the Cloud a Solved Problem? | #23</title>
			<itunes:title>Tobias Ziegler | Is Scalable OLTP in the Cloud a Solved Problem? | #23</itunes:title>
			<pubDate>Mon, 20 Feb 2023 08:03:20 GMT</pubDate>
			<itunes:duration>55:25</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/63f0b626a72d0800115fb647/media.mp3" length="53207168" type="audio/mpeg"/>
			<guid isPermaLink="false">63f0b626a72d0800115fb647</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.cidrdb.org/cidr2023/papers/p50-ziegler.pdf</link>
			<acast:episodeId>63f0b626a72d0800115fb647</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>tobias-ziegler-is-scalable-oltp-in-the-cloud-a-solved-proble</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4I6DLWi59lnyNtISP9QUJFoddxOWaFTLPqR0FS+gHALAwKVp6NYr69bzZK+5keCvfkYzlAxW1q39ZXNnNEP25dO]]></acast:settings>
			<itunes:subtitle><![CDATA[CIDR'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>4</itunes:season>
			<itunes:episode>3</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1676719310192-1a3afc39a2d4cfecd6b75e912e2acc80.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p>Many distributed cloud OLTP databases have settled on a shared-storage design coupled with a single-writer. This design choice is remarkable since conventional wisdom promotes using a shared-nothing architecture for building scalable systems. In this episode, Tobias revisits the question of what a scalable OLTP design for the cloud should look like by analysing the data access behaviour of different systems. Tune in to find out more!</p><h3><br></h3><h3>Links: </h3><ul><li><a href="https://www.cidrdb.org/cidr2023/papers/p50-ziegler.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://www.informatik.tu-darmstadt.de/systems/systems_tuda/group/team_detail_18944.en.jsp" rel="noopener noreferrer" target="_blank">Website</a></li><li><a href="https://open.acast.com/shows/629a6154b4e1e70012764c00/episodes/tobias.ziegler@cs.tu-darmstadt.de" rel="noopener noreferrer" target="_blank">Email</a>&nbsp;</li><li><a href="https://twitter.com/tobiasziegler18" rel="noopener noreferrer" target="_blank">Twitter</a></li><li><a href="https://scholar.google.com/citations?user=qJ_bkjcAAAAJ&amp;hl=en&amp;oi=ao" rel="noopener noreferrer" target="_blank">Google Scholar</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p>Many distributed cloud OLTP databases have settled on a shared-storage design coupled with a single-writer. This design choice is remarkable since conventional wisdom promotes using a shared-nothing architecture for building scalable systems. In this episode, Tobias revisits the question of what a scalable OLTP design for the cloud should look like by analysing the data access behaviour of different systems. Tune in to find out more!</p><h3><br></h3><h3>Links: </h3><ul><li><a href="https://www.cidrdb.org/cidr2023/papers/p50-ziegler.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://www.informatik.tu-darmstadt.de/systems/systems_tuda/group/team_detail_18944.en.jsp" rel="noopener noreferrer" target="_blank">Website</a></li><li><a href="https://open.acast.com/shows/629a6154b4e1e70012764c00/episodes/tobias.ziegler@cs.tu-darmstadt.de" rel="noopener noreferrer" target="_blank">Email</a>&nbsp;</li><li><a href="https://twitter.com/tobiasziegler18" rel="noopener noreferrer" target="_blank">Twitter</a></li><li><a href="https://scholar.google.com/citations?user=qJ_bkjcAAAAJ&amp;hl=en&amp;oi=ao" rel="noopener noreferrer" target="_blank">Google Scholar</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Hamish Nicholson | HetCache: Synergising NVMe Storage and GPU acceleration for Memory-Efficient Analytics | #22</title>
			<itunes:title>Hamish Nicholson | HetCache: Synergising NVMe Storage and GPU acceleration for Memory-Efficient Analytics | #22</itunes:title>
			<pubDate>Mon, 13 Feb 2023 08:02:50 GMT</pubDate>
			<itunes:duration>50:56</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/63e27e39ebb5ae00111613c6/media.mp3" length="48906368" type="audio/mpeg"/>
			<guid isPermaLink="false">63e27e39ebb5ae00111613c6</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.cidrdb.org/cidr2023/papers/p84-nicholson.pdf</link>
			<acast:episodeId>63e27e39ebb5ae00111613c6</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>hamish-nicholson-hetcache</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4I1zNz4dTHcBh7hzRlzB0wInDxR4pzIFK8yYHakDaQdeihhSDEF0llpVhKatPKKLM631JIXd3UL5dEdQso2IMAY]]></acast:settings>
			<itunes:subtitle><![CDATA[CIDR'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>4</itunes:season>
			<itunes:episode>2</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1675787737634-2b0d6e2212803229a0b526b2d405ab78.jpeg"/>
			<description><![CDATA[<h3>Summary:</h3><p>In this episode, Hamish Nicholson tells us about HetCache, a storage engine for analytical workloads that optimizes data access paths and tunes data placement by co-optimizing for the combinations of different memories, compute devices, and queries. Specifically, he presents how the increasingly complex storage hierarchy impacts analytical query processing in GPU-NVMe-accelerated servers. HetCache accelerates analytics on CPU-GPU servers for larger-than-memory datasets through proportional and access-path-aware data placement. Tune in to hear more!</p><p><br></p><h3>Links:</h3><ul><li><a href="https://www.cidrdb.org/cidr2023/papers/p84-nicholson.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://www.nicholson.ai/" rel="noopener noreferrer" target="_blank">Personal website</a></li><li><a href="https://www.linkedin.com/in/hamish-nicholson-53ba1a145/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li><li><a href="https://twitter.com/HamishNicholso3" rel="noopener noreferrer" target="_blank">Twitter</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary:</h3><p>In this episode, Hamish Nicholson tells us about HetCache, a storage engine for analytical workloads that optimizes data access paths and tunes data placement by co-optimizing for the combinations of different memories, compute devices, and queries. Specifically, he presents how the increasingly complex storage hierarchy impacts analytical query processing in GPU-NVMe-accelerated servers. HetCache accelerates analytics on CPU-GPU servers for larger-than-memory datasets through proportional and access-path-aware data placement. Tune in to hear more!</p><p><br></p><h3>Links:</h3><ul><li><a href="https://www.cidrdb.org/cidr2023/papers/p84-nicholson.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://www.nicholson.ai/" rel="noopener noreferrer" target="_blank">Personal website</a></li><li><a href="https://www.linkedin.com/in/hamish-nicholson-53ba1a145/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li><li><a href="https://twitter.com/HamishNicholso3" rel="noopener noreferrer" target="_blank">Twitter</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Immanuel Haffner | mutable: A Modern DBMS for Research and Fast Prototyping | #21</title>
			<itunes:title>Immanuel Haffner | mutable: A Modern DBMS for Research and Fast Prototyping | #21</itunes:title>
			<pubDate>Mon, 06 Feb 2023 08:05:29 GMT</pubDate>
			<itunes:duration>1:28:13</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/63e024f10c69480011f76db2/media.mp3" length="84693120" type="audio/mpeg"/>
			<guid isPermaLink="false">63e024f10c69480011f76db2</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.cidrdb.org/cidr2023/papers/p41-haffner.pdf</link>
			<acast:episodeId>63e024f10c69480011f76db2</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>immanuel-haffner-mutable-a-modern-dbms-for-research-and-fast</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4L1y1pEs68jkojsk+LuSYXUz03iuwQ3QepUqXQRWSMBUx+oECWuQNCt3cFC+vNVmUB6lcN8Jv7dan2a+Omcj2A5]]></acast:settings>
			<itunes:subtitle><![CDATA[CIDR'23]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>4</itunes:season>
			<itunes:episode>1</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1675633898305-8d73d4414084acf399cb5cb850b7f668.jpeg"/>
			<description><![CDATA[<h3>Summary:</h3><p>Few, if any, DBMSs provide extensibility together with implementations of modern concepts such as query compilation. This is an impeding factor in academic research. In this episode, Immanuel Haffner presents mutable, a system tailored to academic research and education. mutable features a modular design, where individual components can be composed to form a complete system. Check out the episode to learn more!</p><p><br></p><h3>Links:</h3><ul><li><a href="https://www.cidrdb.org/cidr2023/papers/p41-haffner.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://bigdata.uni-saarland.de/people/haffner.php" rel="noopener noreferrer" target="_blank">Website</a></li><li><a href="https://github.com/mutable-org/mutable" rel="noopener noreferrer" target="_blank">mutable GitHub repo</a></li><li><a href="https://m.xkcd.com/327/" rel="noopener noreferrer" target="_blank">Bobby Tables xkcd</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary:</h3><p>Few, if any, DBMSs provide extensibility together with implementations of modern concepts such as query compilation. This is an impeding factor in academic research. In this episode, Immanuel Haffner presents mutable, a system tailored to academic research and education. mutable features a modular design, where individual components can be composed to form a complete system. Check out the episode to learn more!</p><p><br></p><h3>Links:</h3><ul><li><a href="https://www.cidrdb.org/cidr2023/papers/p41-haffner.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://bigdata.uni-saarland.de/people/haffner.php" rel="noopener noreferrer" target="_blank">Website</a></li><li><a href="https://github.com/mutable-org/mutable" rel="noopener noreferrer" target="_blank">mutable GitHub repo</a></li><li><a href="https://m.xkcd.com/327/" rel="noopener noreferrer" target="_blank">Bobby Tables xkcd</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Konstantinos Kallas | Practically Correct, Just-in-Time Shell Script Parallelization | #20</title>
			<itunes:title>Konstantinos Kallas | Practically Correct, Just-in-Time Shell Script Parallelization | #20</itunes:title>
			<pubDate>Mon, 30 Jan 2023 08:05:24 GMT</pubDate>
			<itunes:duration>57:48</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/63d674454ec27b0011f853db/media.mp3" length="55492736" type="audio/mpeg"/>
			<guid isPermaLink="false">63d674454ec27b0011f853db</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://angelhof.github.io/files/papers/pashjit-2022-osdi.pdf</link>
			<acast:episodeId>63d674454ec27b0011f853db</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>konstantinos-kallas</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4IeEoO2iuhHb8GxInGYQ1X5CQS2B8tkczMODu/pcLpH4TiUYqPpcMvL2qJ9GnbqJpOYKe27M2xaqHXnRxRBqRKp]]></acast:settings>
			<itunes:subtitle><![CDATA[OSDI'22]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>3</itunes:season>
			<itunes:episode>5</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1674998807886-7517e67867768aca565a384964099b97.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p>Recent shell-script parallelization systems enjoy mostly automated speedups by parallelizing scripts ahead-of-time. Unfortunately, such static parallelization is hampered by dynamic behavior pervasive in shell scripts—e.g., variable expansion and command substitution—which often requires reasoning about the current state of the shell and filesystem. Tune in to hear how Konstantinos Kallas and his colleagues overcame this issue (and others) with PaSH-JIT, a just-in-time (JIT) shell-script compiler!</p><p><br></p><h3>Links: </h3><ul><li><a href="https://angelhof.github.io/files/papers/pashjit-2022-osdi.pdf" rel="noopener noreferrer" target="_blank">OSDI paper</a></li><li><a href="https://angelhof.github.io/" rel="noopener noreferrer" target="_blank">Personal website</a></li><li><a href="https://twitter.com/KonsKallas" rel="noopener noreferrer" target="_blank">Twitter</a></li><li><a href="https://www.linkedin.com/in/konstantinoskallas/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li><li><a href="https://binpa.sh/" rel="noopener noreferrer" target="_blank">PaSH homepage</a> (you can find all associated papers here)</li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p>Recent shell-script parallelization systems enjoy mostly automated speedups by parallelizing scripts ahead-of-time. Unfortunately, such static parallelization is hampered by dynamic behavior pervasive in shell scripts—e.g., variable expansion and command substitution—which often requires reasoning about the current state of the shell and filesystem. Tune in to hear how Konstantinos Kallas and his colleagues overcame this issue (and others) with PaSH-JIT, a just-in-time (JIT) shell-script compiler!</p><p><br></p><h3>Links: </h3><ul><li><a href="https://angelhof.github.io/files/papers/pashjit-2022-osdi.pdf" rel="noopener noreferrer" target="_blank">OSDI paper</a></li><li><a href="https://angelhof.github.io/" rel="noopener noreferrer" target="_blank">Personal website</a></li><li><a href="https://twitter.com/KonsKallas" rel="noopener noreferrer" target="_blank">Twitter</a></li><li><a href="https://www.linkedin.com/in/konstantinoskallas/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li><li><a href="https://binpa.sh/" rel="noopener noreferrer" target="_blank">PaSH homepage</a> (you can find all associated papers here)</li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Vasily Sartakov | CAP-VMs: Capability-Based Isolation and Sharing in the Cloud #19</title>
			<itunes:title>Vasily Sartakov | CAP-VMs: Capability-Based Isolation and Sharing in the Cloud #19</itunes:title>
			<pubDate>Mon, 23 Jan 2023 08:00:50 GMT</pubDate>
			<itunes:duration>36:10</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/63cadf1eeb0a7a0010fa2856/media.mp3" length="34732160" type="audio/mpeg"/>
			<guid isPermaLink="false">63cadf1eeb0a7a0010fa2856</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.usenix.org/system/files/osdi22-sartakov.pdf</link>
			<acast:episodeId>63cadf1eeb0a7a0010fa2856</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>vasily-sartakov-cap-vms-capability-based-isolation-and-shari</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4Io9CVZLklytQEaNgH4vCGqg8G9kjaInrRg5dJ4DWQTtw/MFOAVK261xeMWrCHlDt52u3XmYfbvI20czEhlEX+/]]></acast:settings>
			<itunes:subtitle><![CDATA[OSDI'22]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>3</itunes:season>
			<itunes:episode>4</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1674239739133-dd6cbed658c075b5feb9549d3eafba13.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p>Cloud stacks must isolate application components, while permitting efficient data sharing between components deployed on the same physical host. Traditionally, the memory management unit&nbsp;(MMU) enforces isolation and permits sharing at page granularity. MMU approaches, however, lead to cloud stacks with large trusted computing bases in kernel space, and page granularity requires inefficient OS interfaces for data sharing. Forthcoming CPUs with hardware support for&nbsp;<em>memory capabilities</em>&nbsp;offer new opportunities to implement isolation and sharing at a finer granularity. In this episode, Vasily talks about his work on <em>cVMs</em>, a new VM-like abstraction that uses memory capabilities to isolate application components while supporting efficient data sharing, all without mandating application code to be capability-aware. Listen to find out more!</p><p><br></p><h3>Links: </h3><ul><li><a href="https://www.usenix.org/system/files/osdi22-sartakov.pdf" rel="noopener noreferrer" target="_blank">OSDI Paper</a></li><li><a href="https://www.doc.ic.ac.uk/~vsartako/" rel="noopener noreferrer" target="_blank">Vasily's homepage</a></li><li><a href="https://www.linkedin.com/in/vasily-sartakov-80a0971a/?originalSubdomain=uk" rel="noopener noreferrer" target="_blank">Vasily's LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p>Cloud stacks must isolate application components, while permitting efficient data sharing between components deployed on the same physical host. Traditionally, the memory management unit&nbsp;(MMU) enforces isolation and permits sharing at page granularity. MMU approaches, however, lead to cloud stacks with large trusted computing bases in kernel space, and page granularity requires inefficient OS interfaces for data sharing. Forthcoming CPUs with hardware support for&nbsp;<em>memory capabilities</em>&nbsp;offer new opportunities to implement isolation and sharing at a finer granularity. In this episode, Vasily talks about his work on <em>cVMs</em>, a new VM-like abstraction that uses memory capabilities to isolate application components while supporting efficient data sharing, all without mandating application code to be capability-aware. Listen to find out more!</p><p><br></p><h3>Links: </h3><ul><li><a href="https://www.usenix.org/system/files/osdi22-sartakov.pdf" rel="noopener noreferrer" target="_blank">OSDI Paper</a></li><li><a href="https://www.doc.ic.ac.uk/~vsartako/" rel="noopener noreferrer" target="_blank">Vasily's homepage</a></li><li><a href="https://www.linkedin.com/in/vasily-sartakov-80a0971a/?originalSubdomain=uk" rel="noopener noreferrer" target="_blank">Vasily's LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Haoran Ma | MemLiner: Lining up Tracing and Application for a Far-Memory-Friendly Runtime | #18</title>
			<itunes:title>Haoran Ma | MemLiner: Lining up Tracing and Application for a Far-Memory-Friendly Runtime | #18</itunes:title>
			<pubDate>Mon, 16 Jan 2023 08:00:33 GMT</pubDate>
			<itunes:duration>44:25</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/63c291f27ae74e0010a895d4/media.mp3" length="42647680" type="audio/mpeg"/>
			<guid isPermaLink="false">63c291f27ae74e0010a895d4</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://web.cs.ucla.edu/~harryxu/papers/memliner-osdi22.pdf</link>
			<acast:episodeId>63c291f27ae74e0010a895d4</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>haoran-ma-memliner-lining-up-tracing-and-application-for-a-f</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4I4WDdBuQza1B7jMkxvrbxntAe+qdoBf0I1tP64aDOcT0RjpPlTWfIiVbPMJ+8zS7NTpaq7BDAnHqr184asFnVc]]></acast:settings>
			<itunes:subtitle><![CDATA[OSDI'22]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>3</itunes:season>
			<itunes:episode>3</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1673695418812-b41eb3022cd84e25d86f72c759bf86a3.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p>Far-memory techniques, which enable applications to use remote memory, are increasingly appealing in modern data centers, supporting applications’ large memory footprints and improving machines’ resource utilization. In this episode Haoran Ma tells us about the problems with current far-memory techniques: they focus on OS-level optimizations and are agnostic to the managed runtimes and garbage collection (GC) underneath applications written in high-level languages. Owing to different object-access patterns from applications, GC can severely interfere with existing far-memory techniques, breaking remote memory prefetching algorithms and causing severe local-memory misses. To address this, Haoran and his colleagues developed MemLiner, a runtime technique that improves the performance of far-memory systems by “lining up” memory accesses from the application and the GC so that they follow similar memory access paths, thereby (1) reducing the local-memory working set and (2) improving remote-memory prefetching through simplified memory access patterns. Listen to the episode to learn more! </p><p><br></p><h3>Links: </h3><ul><li><a href="https://web.cs.ucla.edu/~harryxu/papers/memliner-osdi22.pdf" rel="noopener noreferrer" target="_blank">OSDI'22 MemLiner paper</a></li><li><a href="https://www.youtube.com/watch?v=xheC17lomr8" rel="noopener noreferrer" target="_blank">OSDI'22 Presentation </a></li><li><a href="http://www.haoranma.info/" rel="noopener noreferrer" target="_blank">Haoran's website</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p>Far-memory techniques, which enable applications to use remote memory, are increasingly appealing in modern data centers, supporting applications’ large memory footprints and improving machines’ resource utilization. In this episode Haoran Ma tells us about the problems with current far-memory techniques: they focus on OS-level optimizations and are agnostic to the managed runtimes and garbage collection (GC) underneath applications written in high-level languages. Owing to different object-access patterns from applications, GC can severely interfere with existing far-memory techniques, breaking remote memory prefetching algorithms and causing severe local-memory misses. To address this, Haoran and his colleagues developed MemLiner, a runtime technique that improves the performance of far-memory systems by “lining up” memory accesses from the application and the GC so that they follow similar memory access paths, thereby (1) reducing the local-memory working set and (2) improving remote-memory prefetching through simplified memory access patterns. Listen to the episode to learn more! </p><p><br></p><h3>Links: </h3><ul><li><a href="https://web.cs.ucla.edu/~harryxu/papers/memliner-osdi22.pdf" rel="noopener noreferrer" target="_blank">OSDI'22 MemLiner paper</a></li><li><a href="https://www.youtube.com/watch?v=xheC17lomr8" rel="noopener noreferrer" target="_blank">OSDI'22 Presentation </a></li><li><a href="http://www.haoranma.info/" rel="noopener noreferrer" target="_blank">Haoran's website</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Lexiang Huang | Metastable Failures in the Wild | #17</title>
			<itunes:title>Lexiang Huang | Metastable Failures in the Wild | #17</itunes:title>
			<pubDate>Mon, 09 Jan 2023 08:00:55 GMT</pubDate>
			<itunes:duration>53:18</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/63b9a60d1695af0011cb26f8/media.mp3" length="51177600" type="audio/mpeg"/>
			<guid isPermaLink="false">63b9a60d1695af0011cb26f8</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://www.usenix.org/conference/osdi22/presentation/huang-lexiang</link>
			<acast:episodeId>63b9a60d1695af0011cb26f8</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>lexiang-huang-metastable-failures-in-the-wild-17</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4JSyPgbovy82yV95gQWMtvE1ayIIgqCAosPIzKMJ1XSxrPqLlP1SEaMLkq3lvvMrMEOzN9Asd7On5yH74/cJPxb]]></acast:settings>
			<itunes:subtitle><![CDATA[OSDI'22]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>3</itunes:season>
			<itunes:episode>2</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1673110624189-477308bece2b6cfd1786da70fc0645be.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p>In this episode, Lexiang Huang&nbsp;talks about a framework for understanding a class of failures in distributed systems called metastable failures. Lexiang tells us about his study on the prevalence of such failures in the wild and how he and his colleagues scoured publicly available incident reports from many organizations, ranging from hyperscalers to small companies. Listen to the episode to find out about his main findings and gain a deeper understanding of metastable failures and how you can identify, prevent, and mitigate against them!</p><p><br></p><h3>Links: </h3><ul><li><a href="https://www.usenix.org/conference/osdi22/presentation/huang-lexiang" rel="noopener noreferrer" target="_blank">OSDI paper and talk</a></li><li><a href="https://sites.psu.edu/lexiangh/" rel="noopener noreferrer" target="_blank">Personal website</a></li><li><a href="https://twitter.com/lex_psu" rel="noopener noreferrer" target="_blank">Twitter</a></li><li><a href="https://www.linkedin.com/in/lexiang-huang-303382141/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p>In this episode, Lexiang Huang&nbsp;talks about a framework for understanding a class of failures in distributed systems called metastable failures. Lexiang tells us about his study on the prevalence of such failures in the wild and how he and his colleagues scoured publicly available incident reports from many organizations, ranging from hyperscalers to small companies. Listen to the episode to find out about his main findings and gain a deeper understanding of metastable failures and how you can identify, prevent, and mitigate against them!</p><p><br></p><h3>Links: </h3><ul><li><a href="https://www.usenix.org/conference/osdi22/presentation/huang-lexiang" rel="noopener noreferrer" target="_blank">OSDI paper and talk</a></li><li><a href="https://sites.psu.edu/lexiangh/" rel="noopener noreferrer" target="_blank">Personal website</a></li><li><a href="https://twitter.com/lex_psu" rel="noopener noreferrer" target="_blank">Twitter</a></li><li><a href="https://www.linkedin.com/in/lexiang-huang-303382141/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Andrew Quinn | Debugging the OmniTable Way | #16</title>
			<itunes:title>Andrew Quinn | Debugging the OmniTable Way | #16</itunes:title>
			<pubDate>Mon, 02 Jan 2023 14:00:00 GMT</pubDate>
			<itunes:duration>57:58</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/63b2cbe2432cf00011a53fcf/media.mp3" length="55662720" type="audio/mpeg"/>
			<guid isPermaLink="false">63b2cbe2432cf00011a53fcf</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://arquinn.github.io/assets/pdf/quinn22.pdf</link>
			<acast:episodeId>63b2cbe2432cf00011a53fcf</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>andrew-quinn-debugging-the-omnitable-way</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4JDuXlfPpCBh2Tkht+q9a5wq8Wx3zYdiBn+SzQqjFjciHg/GHkPo77fBE269uo+dJ2kZQNENhtrsMOaAR64Td0k]]></acast:settings>
			<itunes:subtitle><![CDATA[OSDI'22]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>3</itunes:season>
			<itunes:episode>1</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1672661895972-70e0dd4da5c49cfedac21f0c15b68130.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p>Debugging is time-consuming, accounting for roughly 50% of a developer's time. In this episode Andrew Quinn tells us about the OmniTable, an abstraction that captures all execution state as a large queryable data table. In his research Andrew has built a query model around an OmniTable that supports SQL to simplify debugging. An OmniTable decouples debugging logic from the original execution, which SteamDrill, Andrew's prototype, uses to reduce the performance overhead of debugging (SteamDrill queries are an order of magnitude faster than existing debugging tools).</p><p><br></p><h3>Links: </h3><ul><li><a href="https://arquinn.github.io/" rel="noopener noreferrer" target="_blank">Andrew's Homepage</a></li><li><a href="https://arquinn.github.io/assets/pdf/quinn22.pdf" rel="noopener noreferrer" target="_blank">Debugging the OmniTable Way OSDI'22 Paper</a></li><li><a href="https://github.com/arquinn/SteamDrill" rel="noopener noreferrer" target="_blank">SteamDrill GitHub Repo</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p>Debugging is time-consuming, accounting for roughly 50% of a developer's time. In this episode Andrew Quinn tells us about the OmniTable, an abstraction that captures all execution state as a large queryable data table. In his research Andrew has built a query model around an OmniTable that supports SQL to simplify debugging. An OmniTable decouples debugging logic from the original execution, which SteamDrill, Andrew's prototype, uses to reduce the performance overhead of debugging (SteamDrill queries are an order of magnitude faster than existing debugging tools).</p><p><br></p><h3>Links: </h3><ul><li><a href="https://arquinn.github.io/" rel="noopener noreferrer" target="_blank">Andrew's Homepage</a></li><li><a href="https://arquinn.github.io/assets/pdf/quinn22.pdf" rel="noopener noreferrer" target="_blank">Debugging the OmniTable Way OSDI'22 Paper</a></li><li><a href="https://github.com/arquinn/SteamDrill" rel="noopener noreferrer" target="_blank">SteamDrill GitHub Repo</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Audrey Cheng | TAOBench: An End-to-End Benchmark for Social Network Workloads | #15</title>
			<itunes:title>Audrey Cheng | TAOBench: An End-to-End Benchmark for Social Network Workloads | #15</itunes:title>
			<pubDate>Mon, 12 Dec 2022 08:00:04 GMT</pubDate>
			<itunes:duration>52:43</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/638fd14e96d1480011bd3ab6/media.mp3" length="50624640" type="audio/mpeg"/>
			<guid isPermaLink="false">638fd14e96d1480011bd3ab6</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/audrey-cheng-taobench-an-end-to-end-benchmark-for-social-net</link>
			<acast:episodeId>638fd14e96d1480011bd3ab6</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>audrey-cheng-taobench-an-end-to-end-benchmark-for-social-net</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4Jot8Nn/CiVkejx0KzJu1U64LKP7oXIeYxxkKchmj+dbvZwbxlNA/+eEK81vox0IC6MlzANn4uApYzWXkXJDjXx]]></acast:settings>
			<itunes:subtitle><![CDATA[VLDB'22]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>2</itunes:season>
			<itunes:episode>5</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1670370062198-764b10aa73aa28c56ed7220a70593eb1.jpeg"/>
			<description><![CDATA[<p><strong>Summary: </strong>This episode features Audrey Cheng talking about TAOBench, a new benchmark that captures the social graph workload at Meta. Audrey tells us about the features of the workload, how it compares with other benchmarks, and how it fills a gap in the existing space of benchmarks. Also, we hear all about the fantastic real-world impact the benchmark has already had across a range of companies.</p><br><p><strong>Links:</strong></p><ul><li><a href="https://www.vldb.org/pvldb/vol15/p1965-cheng.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://audreyccheng.com/" rel="noopener noreferrer" target="_blank">Personal website</a></li><li><a href="https://engineering.fb.com/2022/09/07/open-source/taobench/" rel="noopener noreferrer" target="_blank">Meta blog post</a></li><li><a href="https://github.com/audreyccheng/taobench" rel="noopener noreferrer" target="_blank">GitHub repo</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p><strong>Summary: </strong>This episode features Audrey Cheng talking about TAOBench, a new benchmark that captures the social graph workload at Meta. Audrey tells us about the features of the workload, how it compares with other benchmarks, and how it fills a gap in the existing space of benchmarks. Also, we hear all about the fantastic real-world impact the benchmark has already had across a range of companies.</p><br><p><strong>Links:</strong></p><ul><li><a href="https://www.vldb.org/pvldb/vol15/p1965-cheng.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://audreyccheng.com/" rel="noopener noreferrer" target="_blank">Personal website</a></li><li><a href="https://engineering.fb.com/2022/09/07/open-source/taobench/" rel="noopener noreferrer" target="_blank">Meta blog post</a></li><li><a href="https://github.com/audreyccheng/taobench" rel="noopener noreferrer" target="_blank">GitHub repo</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>George Konstantinidis | Enabling Personal Consent in Databases | #14</title>
			<itunes:title>George Konstantinidis | Enabling Personal Consent in Databases | #14</itunes:title>
			<pubDate>Mon, 05 Dec 2022 08:00:56 GMT</pubDate>
			<itunes:duration>55:36</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/638a1827b03a150010c47289/media.mp3" length="53385344" type="audio/mpeg"/>
			<guid isPermaLink="false">638a1827b03a150010c47289</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/george-konstantinidis-enabling-personal-consent-in-databases</link>
			<acast:episodeId>638a1827b03a150010c47289</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>george-konstantinidis-enabling-personal-consent-in-databases</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4KmRKHmMH4paGxbqjVnPCdMj4TDlxZ4NaAZqH4RO7k34V8MvGkRfJlVbrnb4PipnQROE4oohsCddtoBqsNlCZlu]]></acast:settings>
			<itunes:subtitle><![CDATA[VLDB'22]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>2</itunes:season>
			<itunes:episode>4</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1669993901254-70360e1ba43d8141bebef582dbd30eaf.jpeg"/>
			<description><![CDATA[<h2>Summary: </h2><p>Users have the right to consent to the use of their data, but current methods are limited to very coarse-grained expressions of consent, such as “opt-in/opt-out” choices for certain uses. In this episode, George talks about how he and his group identified the need for fine-grained consent management and how they formalized how to express and manage user consent and personal contracts of data usage in relational databases. Their approach enables data owners to express the intended data usage in formal specifications, called consent constraints, and enables a service provider that wants to honor these constraints to do so automatically by filtering query results that violate consent, rather than both sides relying on “terms of use” agreements written in natural language. He talks about the implementation of their framework in an open-source RDBMS, and the evaluation against the most relevant privacy approach using the TPC-H benchmark and a real dataset of ICU data. [Summary adapted from George's VLDB paper]</p><p><br></p><h2>Links: </h2><ul><li><a href="https://www.vldb.org/pvldb/vol15/p375-konstantinidis.pdf" rel="noopener noreferrer" target="_blank">VLDB paper</a></li><li><a href="https://github.com/georgekon/enabling-personal-consent" rel="noopener noreferrer" target="_blank">GitHub repo</a></li><li><a href="https://www.turing.ac.uk/people/researchers/george-konstantinidis" rel="noopener noreferrer" target="_blank">Homepage</a></li><li><a href="https://www.linkedin.com/in/george-konstantinidis/" rel="noopener noreferrer" target="_blank">George's LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h2>Summary: </h2><p>Users have the right to consent to the use of their data, but current methods are limited to very coarse-grained expressions of consent, such as “opt-in/opt-out” choices for certain uses. In this episode, George talks about how he and his group identified the need for fine-grained consent management and how they formalized how to express and manage user consent and personal contracts of data usage in relational databases. Their approach enables data owners to express the intended data usage in formal specifications, called consent constraints, and enables a service provider that wants to honor these constraints to do so automatically by filtering query results that violate consent, rather than both sides relying on “terms of use” agreements written in natural language. He talks about the implementation of their framework in an open-source RDBMS, and the evaluation against the most relevant privacy approach using the TPC-H benchmark and a real dataset of ICU data. [Summary adapted from George's VLDB paper]</p><p><br></p><h2>Links: </h2><ul><li><a href="https://www.vldb.org/pvldb/vol15/p375-konstantinidis.pdf" rel="noopener noreferrer" target="_blank">VLDB paper</a></li><li><a href="https://github.com/georgekon/enabling-personal-consent" rel="noopener noreferrer" target="_blank">GitHub repo</a></li><li><a href="https://www.turing.ac.uk/people/researchers/george-konstantinidis" rel="noopener noreferrer" target="_blank">Homepage</a></li><li><a href="https://www.linkedin.com/in/george-konstantinidis/" rel="noopener noreferrer" target="_blank">George's LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Per Fuchs | Sortledton: a Universal, Transactional Graph Data Structure | #13</title>
			<itunes:title>Per Fuchs | Sortledton: a Universal, Transactional Graph Data Structure | #13</itunes:title>
			<pubDate>Mon, 28 Nov 2022 08:00:44 GMT</pubDate>
			<itunes:duration>41:21</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/6383d534dae7b50010d0ba66/media.mp3" length="39712896" type="audio/mpeg"/>
			<guid isPermaLink="false">6383d534dae7b50010d0ba66</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/per-fuchs-sortledton</link>
			<acast:episodeId>6383d534dae7b50010d0ba66</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>per-fuchs-sortledton</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4LKFKkcBtP5cl8sBvVVXnFTIkTHUkWxtfa3cWTFgwhr5DlHUY3yELjVV7S0dxco5YEWUVzRybka7hTKnM/sBZOS]]></acast:settings>
			<itunes:subtitle><![CDATA[VLDB'22]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>2</itunes:season>
			<itunes:episode>3</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1669584148106-26a3a400205233e8a514487672d35990.jpeg"/>
			<description><![CDATA[<h3>Summary (VLDB abstract):</h3><p><br></p><p>Despite the wide adoption of graph processing across many different application domains, there is no underlying data structure that can serve a variety of graph workloads (analytics, traversals, and pattern matching) on dynamic graphs with transactional updates. In this episode, Per talks about Sortledton, a universal graph data structure that addresses this open problem by carefully optimizing for the most relevant data access patterns used by graph computation kernels. It can support millions of transactional updates per second, while providing competitive performance (1.22x on average) for the most common graph workloads relative to the best-known baseline for static graphs – CSR. With this, we improve the ingestion throughput over state-of-the-art dynamic graph data structures, while supporting a wider range of graph computations under transactional guarantees, with a much simpler design and a significantly smaller memory footprint (2.1x that of CSR).</p><p><br></p><h3>Links:</h3><ul><li><a href="https://www.vldb.org/pvldb/vol15/p1173-fuchs.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://www.linkedin.com/in/perfuchs/?originalSubdomain=de" rel="noopener noreferrer" target="_blank">Per's LinkedIn</a></li><li><a href="https://github.com/PerFuchs/gfe_driver" rel="noopener noreferrer" target="_blank">Graph Framework Evaluation</a></li><li><a href="https://gitlab.db.in.tum.de/per.fuchs/sortledton" rel="noopener noreferrer" target="_blank">Implementation</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary (VLDB abstract):</h3><p><br></p><p>Despite the wide adoption of graph processing across many different application domains, there is no underlying data structure that can serve a variety of graph workloads (analytics, traversals, and pattern matching) on dynamic graphs with transactional updates. In this episode, Per talks about Sortledton, a universal graph data structure that addresses this open problem by carefully optimizing for the most relevant data access patterns used by graph computation kernels. It can support millions of transactional updates per second, while providing competitive performance (1.22x on average) for the most common graph workloads relative to the best-known baseline for static graphs – CSR. With this, we improve the ingestion throughput over state-of-the-art dynamic graph data structures, while supporting a wider range of graph computations under transactional guarantees, with a much simpler design and a significantly smaller memory footprint (2.1x that of CSR).</p><p><br></p><h3>Links:</h3><ul><li><a href="https://www.vldb.org/pvldb/vol15/p1173-fuchs.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://www.linkedin.com/in/perfuchs/?originalSubdomain=de" rel="noopener noreferrer" target="_blank">Per's LinkedIn</a></li><li><a href="https://github.com/PerFuchs/gfe_driver" rel="noopener noreferrer" target="_blank">Graph Framework Evaluation</a></li><li><a href="https://gitlab.db.in.tum.de/per.fuchs/sortledton" rel="noopener noreferrer" target="_blank">Implementation</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>George Theodorakis | Scabbard: Single-Node Fault-Tolerant Stream Processing | #12</title>
			<itunes:title>George Theodorakis | Scabbard: Single-Node Fault-Tolerant Stream Processing | #12</itunes:title>
			<pubDate>Mon, 21 Nov 2022 08:46:04 GMT</pubDate>
			<itunes:duration>45:36</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/63793a7e9b896a0011ce16aa/media.mp3" length="43786368" type="audio/mpeg"/>
			<guid isPermaLink="false">63793a7e9b896a0011ce16aa</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/george-theodorakis-scabbard</link>
			<acast:episodeId>63793a7e9b896a0011ce16aa</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>george-theodorakis-scabbard</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4I/ayx62E0p0efTyjT9p3rTLwdrVMMiHF9VRtiPj80qipIs677H8I0PsZk0TWh4re4nkcxJdgtnYmdbbwughab5]]></acast:settings>
			<itunes:subtitle><![CDATA[VLDB'22]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>2</itunes:season>
			<itunes:episode>2</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1668888912578-da08e0fc258b43bb596dc9cf6bb308a3.jpeg"/>
			<description><![CDATA[<h3>Summary (VLDB abstract):</h3><p>Single-node multi-core stream processing engines (SPEs) can process hundreds of millions of tuples per second. Yet making them fault-tolerant with exactly-once semantics while retaining this performance is an open challenge: due to the limited I/O bandwidth of a single node, it becomes infeasible to persist all stream data and operator state during execution. Instead, single-node SPEs rely on upstream distributed systems, such as Apache Kafka, to recover stream data after failure, necessitating complex cluster-based deployments. This lack of built-in fault-tolerance features has hindered the adoption of single-node SPEs. We describe Scabbard, the first single-node SPE that supports exactly-once fault-tolerance semantics despite limited local I/O bandwidth. Scabbard achieves this by integrating persistence operations with the query workload. Within the operator graph, Scabbard determines when to persist streams based on the selectivity of operators: by persisting streams after operators that discard data, it can substantially reduce the required I/O bandwidth. As part of the operator graph, Scabbard supports parallel persistence operations and uses markers to decide when to discard persisted data. The persisted data volume is further reduced using workload-specific compression: Scabbard monitors stream statistics and dynamically generates computationally efficient compression operators. 
Our experiments show that Scabbard can execute stream queries that process over 200 million tuples per second while recovering from failures with sub-second latencies.</p><p><br></p><h3>Questions:</h3><ul><li>Can you start off by explaining what stream processing is and its common use cases?&nbsp;&nbsp;</li><li>How did you end up researching in this area?&nbsp;</li><li>What is Scabbard?&nbsp;</li><li>Can you explain the differences between single-node and distributed SPEs?&nbsp;</li><li>What are the advantages of single-node SPEs?&nbsp;</li><li>What are the pitfalls that have limited the adoption of single-node SPEs?</li><li>What were your design goals when developing Scabbard?</li><li>What is the key idea underpinning Scabbard?</li><li>In the paper you state there are 3 main contributions in Scabbard; can you talk us through each one?</li><li>How did you implement Scabbard? Can you give an overview of the architecture?</li><li>What was your approach to evaluating Scabbard? What were the questions you were trying to answer?</li><li>What did you compare Scabbard against? What was the experimental setup?</li><li>What were the key results?</li><li>Are there any situations when Scabbard’s performance is sub-optimal? What are the limitations?&nbsp;</li><li>Is Scabbard publicly available?&nbsp;&nbsp;</li><li>As a software developer, how do I interact with Scabbard? 
</li><li>What are the most interesting and perhaps unexpected lessons that you have learned while working on Scabbard?</li><li>Progress in research is non-linear, from the conception of the idea for Scabbard to the publication, were there things you tried that failed?&nbsp;</li><li>What do you have planned for future research with Scabbard?</li><li>Can you tell the listeners about your other research?&nbsp;&nbsp;</li><li>How do you approach idea generation and selecting projects?&nbsp;</li><li>What do you think is the biggest challenge in your research area now?&nbsp;</li><li>What’s the one key thing you want listeners to take away from your research?</li></ul><p><br></p><h3>Links:</h3><ul><li><a href="https://vldb.org/pvldb/vol15/p361-theodorakis.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://github.com/lsds/LightSaber" rel="noopener noreferrer" target="_blank">GitHub</a></li><li><a href="https://grtheod.github.io/" rel="noopener noreferrer" target="_blank">George's homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary (VLDB abstract):</h3><p>Single-node multi-core stream processing engines (SPEs) can process hundreds of millions of tuples per second. Yet making them fault-tolerant with exactly-once semantics while retaining this performance is an open challenge: due to the limited I/O bandwidth of a single node, it becomes infeasible to persist all stream data and operator state during execution. Instead, single-node SPEs rely on upstream distributed systems, such as Apache Kafka, to recover stream data after failure, necessitating complex cluster-based deployments. This lack of built-in fault-tolerance features has hindered the adoption of single-node SPEs. We describe Scabbard, the first single-node SPE that supports exactly-once fault-tolerance semantics despite limited local I/O bandwidth. Scabbard achieves this by integrating persistence operations with the query workload. Within the operator graph, Scabbard determines when to persist streams based on the selectivity of operators: by persisting streams after operators that discard data, it can substantially reduce the required I/O bandwidth. As part of the operator graph, Scabbard supports parallel persistence operations and uses markers to decide when to discard persisted data. The persisted data volume is further reduced using workload-specific compression: Scabbard monitors stream statistics and dynamically generates computationally efficient compression operators. 
Our experiments show that Scabbard can execute stream queries that process over 200 million tuples per second while recovering from failures with sub-second latencies.</p><p><br></p><h3>Questions:</h3><ul><li>Can you start off by explaining what stream processing is and its common use cases?&nbsp;&nbsp;</li><li>How did you end up researching in this area?&nbsp;</li><li>What is Scabbard?&nbsp;</li><li>Can you explain the differences between single-node and distributed SPEs?&nbsp;</li><li>What are the advantages of single-node SPEs?&nbsp;</li><li>What are the pitfalls that have limited the adoption of single-node SPEs?</li><li>What were your design goals when developing Scabbard?</li><li>What is the key idea underpinning Scabbard?</li><li>In the paper you state there are 3 main contributions in Scabbard; can you talk us through each one?</li><li>How did you implement Scabbard? Can you give an overview of the architecture?</li><li>What was your approach to evaluating Scabbard? What were the questions you were trying to answer?</li><li>What did you compare Scabbard against? What was the experimental setup?</li><li>What were the key results?</li><li>Are there any situations when Scabbard’s performance is sub-optimal? What are the limitations?&nbsp;</li><li>Is Scabbard publicly available?&nbsp;&nbsp;</li><li>As a software developer, how do I interact with Scabbard? 
</li><li>What are the most interesting and perhaps unexpected lessons that you have learned while working on Scabbard?</li><li>Progress in research is non-linear, from the conception of the idea for Scabbard to the publication, were there things you tried that failed?&nbsp;</li><li>What do you have planned for future research with Scabbard?</li><li>Can you tell the listeners about your other research?&nbsp;&nbsp;</li><li>How do you approach idea generation and selecting projects?&nbsp;</li><li>What do you think is the biggest challenge in your research area now?&nbsp;</li><li>What’s the one key thing you want listeners to take away from your research?</li></ul><p><br></p><h3>Links:</h3><ul><li><a href="https://vldb.org/pvldb/vol15/p361-theodorakis.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://github.com/lsds/LightSaber" rel="noopener noreferrer" target="_blank">GitHub</a></li><li><a href="https://grtheod.github.io/" rel="noopener noreferrer" target="_blank">George's homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Kevin Gaffney | SQLite: Past, Present, and Future | #11</title>
			<itunes:title>Kevin Gaffney | SQLite: Past, Present, and Future | #11</itunes:title>
			<pubDate>Mon, 14 Nov 2022 08:15:48 GMT</pubDate>
			<itunes:duration>48:18</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/63702a21cc535100112c5c17/media.mp3" length="46375040" type="audio/mpeg"/>
			<guid isPermaLink="false">63702a21cc535100112c5c17</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/kevin-gaffney-sqlite-past-present-and-future</link>
			<acast:episodeId>63702a21cc535100112c5c17</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>kevin-gaffney-sqlite-past-present-and-future</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4KvAQkOJqP+zSX7k0JEB4Gz+ROJhma9gkCYIJi2GBf0M6cbVX07fzKBmqKH7T5VpWPITly3uxmZR+Fc5uLH19Ts]]></acast:settings>
			<itunes:subtitle><![CDATA[VLDB'22]]></itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>2</itunes:season>
			<itunes:episode>1</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1668294450178-f760380b7188ebed663a2fff8daa1909.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p>In this episode Kevin Gaffney tells us about SQLite, the most widely deployed database engine in existence. SQLite is found in nearly every smartphone, computer, web browser, television, and automobile. Several factors are likely responsible for its ubiquity, including its in-process design, standalone codebase, extensive test suite, and cross-platform file format. While it supports complex analytical queries, SQLite is primarily designed for fast online transaction processing (OLTP), employing row-oriented execution and a B-tree storage format. However, fueled by the rise of edge computing and data science, there is a growing need for efficient in-process online analytical processing (OLAP). DuckDB, a database engine nicknamed “the SQLite for analytics”, has recently emerged to meet this demand. While DuckDB has shown strong performance on OLAP benchmarks, it is unclear how SQLite compares... Listen to the podcast to find out more about Kevin's work on identifying key bottlenecks in OLAP workloads and the optimizations he has helped develop.</p><p><br></p><h3>Questions: </h3><ul><li>How did you end up researching databases?&nbsp;</li><li>Can you describe what SQLite is?&nbsp;</li><li>Can you give the listener an overview of SQLite’s architecture?&nbsp;</li><li>How does SQLite provide ACID guarantees?&nbsp;</li><li>How have hardware and workloads changed across SQLite’s life?&nbsp;</li><li>What challenges do these changes pose for SQLite?</li><li>In your paper you subject SQLite to an extensive performance evaluation; what were the questions you were trying to answer?&nbsp;</li><li>What was the experimental setup? What benchmarks did you use?</li><li>How realistic are these workloads? 
How closely do these map to user studies?&nbsp;</li><li>What were the key results in your OLTP experiments?</li><li>You mentioned that delete performance was poor in the user study, did you observe why in the OLTP experiment?</li><li>Can you talk us through your OLAP experiment?</li><li>What were the key analytical data processing bottlenecks you found in SQLite?</li><li>What were your optimizations? How did they perform?&nbsp;</li><li>What are the reasons for SQLite using dynamic programming?</li><li>Are your optimizations available in SQLite today?&nbsp;</li><li>What were the findings in your blob I/O experiment?&nbsp;</li><li>Progress in research is non-linear, from the conception of the idea for your paper to the publication, were there things you tried that failed?&nbsp;</li><li>What do you have planned for future research?&nbsp;</li><li>How do you think SQLite will evolve over the coming years?&nbsp;</li><li>Can you tell the listeners about your other research?</li><li>What do you think is the biggest challenge in your research area now?&nbsp;</li><li>What’s the one key thing you want listeners to take away from your research?</li></ul><p><br></p><h3>Links: </h3><ul><li><a href="https://www.vldb.org/pvldb/vol15/p3535-gaffney.pdf" rel="noopener noreferrer" target="_blank">SQLite: Past, Present, and Future</a></li><li><a href="http://www.vldb.org/pvldb/vol14/p1467-gaffney.pdf" rel="noopener noreferrer" target="_blank">Database Isolation By Scheduling</a></li><li><a href="https://www.linkedin.com/in/kpgaffney/" rel="noopener noreferrer" target="_blank">Kevin's LinkedIn</a></li><li><a href="https://www.sqlite.org/index.html" rel="noopener noreferrer" target="_blank">SQLite Homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p>In this episode Kevin Gaffney tells us about SQLite, the most widely deployed database engine in existence. SQLite is found in nearly every smartphone, computer, web browser, television, and automobile. Several factors are likely responsible for its ubiquity, including its in-process design, standalone codebase, extensive test suite, and cross-platform file format. While it supports complex analytical queries, SQLite is primarily designed for fast online transaction processing (OLTP), employing row-oriented execution and a B-tree storage format. However, fueled by the rise of edge computing and data science, there is a growing need for efficient in-process online analytical processing (OLAP). DuckDB, a database engine nicknamed “the SQLite for analytics”, has recently emerged to meet this demand. While DuckDB has shown strong performance on OLAP benchmarks, it is unclear how SQLite compares... Listen to the podcast to find out more about Kevin's work on identifying key bottlenecks in OLAP workloads and the optimizations he has helped develop.</p><p><br></p><h3>Questions: </h3><ul><li>How did you end up researching databases?&nbsp;</li><li>Can you describe what SQLite is?&nbsp;</li><li>Can you give the listener an overview of SQLite’s architecture?&nbsp;</li><li>How does SQLite provide ACID guarantees?&nbsp;</li><li>How have hardware and workloads changed across SQLite’s life?&nbsp;</li><li>What challenges do these changes pose for SQLite?</li><li>In your paper you subject SQLite to an extensive performance evaluation, what were the questions you were trying to answer?&nbsp;</li><li>What was the experimental setup? What benchmarks did you use?</li><li>How realistic are these workloads? 
How closely do these map to user studies?&nbsp;</li><li>What were the key results in your OLTP experiments?</li><li>You mentioned that delete performance was poor in the user study, did you observe why in the OLTP experiment?</li><li>Can you talk us through your OLAP experiment?</li><li>What were the key analytical data processing bottlenecks you found in SQLite?</li><li>What were your optimizations? How did they perform?&nbsp;</li><li>What are the reasons for SQLite using dynamic programming?</li><li>Are your optimizations available in SQLite today?&nbsp;</li><li>What were the findings in your blob I/O experiment?&nbsp;</li><li>Progress in research is non-linear, from the conception of the idea for your paper to the publication, were there things you tried that failed?&nbsp;</li><li>What do you have planned for future research?&nbsp;</li><li>How do you think SQLite will evolve over the coming years?&nbsp;</li><li>Can you tell the listeners about your other research?</li><li>What do you think is the biggest challenge in your research area now?&nbsp;</li><li>What’s the one key thing you want listeners to take away from your research?</li></ul><p><br></p><h3>Links: </h3><ul><li><a href="https://www.vldb.org/pvldb/vol15/p3535-gaffney.pdf" rel="noopener noreferrer" target="_blank">SQLite: Past, Present, and Future</a></li><li><a href="http://www.vldb.org/pvldb/vol14/p1467-gaffney.pdf" rel="noopener noreferrer" target="_blank">Database Isolation By Scheduling</a></li><li><a href="https://www.linkedin.com/in/kpgaffney/" rel="noopener noreferrer" target="_blank">Kevin's LinkedIn</a></li><li><a href="https://www.sqlite.org/index.html" rel="noopener noreferrer" target="_blank">SQLite Homepage</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Matthias Jasny | P4DB - The Case for In-Network OLTP | #10</title>
			<itunes:title>Matthias Jasny | P4DB - The Case for In-Network OLTP | #10</itunes:title>
			<pubDate>Mon, 08 Aug 2022 08:00:11 GMT</pubDate>
			<itunes:duration>27:20</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/62f025503bb029001330eba5/media.mp3" length="26255488" type="audio/mpeg"/>
			<guid isPermaLink="false">62f025503bb029001330eba5</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://dl.acm.org/doi/pdf/10.1145/3514221.3517825</link>
			<acast:episodeId>62f025503bb029001330eba5</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>matthias-jasny-p4db</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4LINBgerQU1Xnx/GzVYzepJkpHY1Cx6Npb1qE07JmgZEHURtY6i9rdcSoSLlLYd1YbcJmL+4HDRYdPyglEsj16r]]></acast:settings>
			<itunes:subtitle>ACM SIGMOD/PODS 2022</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>1</itunes:season>
			<itunes:episode>10</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1659903539909-9ef4956b728f1c0d56b8d19e6b7dfb47.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p>In this episode Matthias Jasny from TU Darmstadt talks about P4DB, a database that uses a programmable switch to accelerate OLTP workloads. The main idea of P4DB is that it implements a transaction processing engine on top of a P4-programmable switch. The switch can thus act as an accelerator in the network, especially when it is used to store and process hot (contended) tuples on the switch. P4DB provides significant benefits compared to traditional DBMS architectures and can achieve a speedup of up to 8x.</p><p><br></p><h3>Questions: </h3><p>0:55: Can you set the scene for your research and describe the motivation behind P4DB?&nbsp;</p><p>1:42: Can you describe to listeners who may not be familiar with them, what exactly is a programmable switch?&nbsp;</p><p>3:55: What are the characteristics of OLTP workloads that make them a good fit for programmable switches?</p><p>5:33: Can you elaborate on the key idea of P4DB?</p><p>6:46: How do you go about mapping the execution of transactions to the architecture of a programmable switch?</p><p>10:13: Can you walk us through the lifecycle of a switch transaction?</p><p>11:04: How does P4DB determine the optimal tuple placement on the switch?</p><p>12:16: Is this allocation static or is it dynamic? Can the tuple order be changed at runtime?</p><p>12:55: What happens if a transaction needs to access tuples in a different order than that laid out on the switch?&nbsp;</p><p>14:11: Obviously you can’t fit all data on the switch, only the hot data, so how does P4DB execute transactions that access some hot and some cold data that’s not on the switch?</p><p>16:04: How did you evaluate P4DB? What are the results?&nbsp;&nbsp;</p><p>18:28: What was the magnitude of the speed-up in the scenarios in which P4DB showed performance gains? 
</p><p>19:29: Are there any situations in which P4DB performs non-optimally and what are the workload characteristics of these situations?</p><p>20:36: How many tuples can you get on a switch?&nbsp;</p><p>21:23: Where do you see your results being useful? Who will find them the most relevant?&nbsp;</p><p>21:57: Across your time working on P4DB, what are the most interesting, perhaps unexpected, lessons that you learned?&nbsp;</p><p>22:39: That leads me into my next question: what were the things you tried while working on P4DB that failed? Can you give any words of advice to people who might work with programmable switches in the future?&nbsp;</p><p>23:24: What do you have planned for future research?&nbsp;</p><p>24:24: Is P4DB publicly available? </p><p>24:53: What attracted you to this research area?</p><p>25:42: What’s the one key thing you want listeners to take away from your research and your work on P4DB? </p><h3><br></h3><h3> Links: </h3><ul><li><a href="https://dl.acm.org/doi/pdf/10.1145/3514221.3517825" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://www.youtube.com/watch?v=o4_PCUypm4Y" rel="noopener noreferrer" target="_blank">Presentation</a></li><li><a href="https://www.informatik.tu-darmstadt.de/systems/systems_tuda/group/team_detail_111232.en.jsp" rel="noopener noreferrer" target="_blank">Website</a></li><li><a href="mailto:matthias.jasny@cs.tu-darmstadt.de" rel="noopener noreferrer" target="_blank">Email</a></li><li><a href="https://scholar.google.com/citations?user=S5U7eEEAAAAJ&amp;hl=en" rel="noopener noreferrer" target="_blank">Google Scholar</a></li><li><a href="https://github.com/DataManagementLab/p4db" rel="noopener noreferrer" target="_blank">P4DB</a></li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p>In this episode Matthias Jasny from TU Darmstadt talks about P4DB, a database that uses a programmable switch to accelerate OLTP workloads. The main idea of P4DB is that it implements a transaction processing engine on top of a P4-programmable switch. The switch can thus act as an accelerator in the network, especially when it is used to store and process hot (contended) tuples on the switch. P4DB provides significant benefits compared to traditional DBMS architectures and can achieve a speedup of up to 8x.</p><p><br></p><h3>Questions: </h3><p>0:55: Can you set the scene for your research and describe the motivation behind P4DB?&nbsp;</p><p>1:42: Can you describe to listeners who may not be familiar with them, what exactly is a programmable switch?&nbsp;</p><p>3:55: What are the characteristics of OLTP workloads that make them a good fit for programmable switches?</p><p>5:33: Can you elaborate on the key idea of P4DB?</p><p>6:46: How do you go about mapping the execution of transactions to the architecture of a programmable switch?</p><p>10:13: Can you walk us through the lifecycle of a switch transaction?</p><p>11:04: How does P4DB determine the optimal tuple placement on the switch?</p><p>12:16: Is this allocation static or is it dynamic? Can the tuple order be changed at runtime?</p><p>12:55: What happens if a transaction needs to access tuples in a different order than that laid out on the switch?&nbsp;</p><p>14:11: Obviously you can’t fit all data on the switch, only the hot data, so how does P4DB execute transactions that access some hot and some cold data that’s not on the switch?</p><p>16:04: How did you evaluate P4DB? What are the results?&nbsp;&nbsp;</p><p>18:28: What was the magnitude of the speed-up in the scenarios in which P4DB showed performance gains? 
</p><p>19:29: Are there any situations in which P4DB performs non-optimally and what are the workload characteristics of these situations?</p><p>20:36: How many tuples can you get on a switch?&nbsp;</p><p>21:23: Where do you see your results being useful? Who will find them the most relevant?&nbsp;</p><p>21:57: Across your time working on P4DB, what are the most interesting, perhaps unexpected, lessons that you learned?&nbsp;</p><p>22:39: That leads me into my next question: what were the things you tried while working on P4DB that failed? Can you give any words of advice to people who might work with programmable switches in the future?&nbsp;</p><p>23:24: What do you have planned for future research?&nbsp;</p><p>24:24: Is P4DB publicly available? </p><p>24:53: What attracted you to this research area?</p><p>25:42: What’s the one key thing you want listeners to take away from your research and your work on P4DB? </p><h3><br></h3><h3> Links: </h3><ul><li><a href="https://dl.acm.org/doi/pdf/10.1145/3514221.3517825" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://www.youtube.com/watch?v=o4_PCUypm4Y" rel="noopener noreferrer" target="_blank">Presentation</a></li><li><a href="https://www.informatik.tu-darmstadt.de/systems/systems_tuda/group/team_detail_111232.en.jsp" rel="noopener noreferrer" target="_blank">Website</a></li><li><a href="mailto:matthias.jasny@cs.tu-darmstadt.de" rel="noopener noreferrer" target="_blank">Email</a></li><li><a href="https://scholar.google.com/citations?user=S5U7eEEAAAAJ&amp;hl=en" rel="noopener noreferrer" target="_blank">Google Scholar</a></li><li><a href="https://github.com/DataManagementLab/p4db" rel="noopener noreferrer" target="_blank">P4DB</a></li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Tobias Ziegler | ScaleStore: A Fast and Cost-Efficient Storage Engine using DRAM, NVMe, and RDMA | #9</title>
			<itunes:title>Tobias Ziegler | ScaleStore: A Fast and Cost-Efficient Storage Engine using DRAM, NVMe, and RDMA | #9</itunes:title>
			<pubDate>Mon, 01 Aug 2022 08:00:43 GMT</pubDate>
			<itunes:duration>23:08</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/62e6682c7d3d0f001257b3f6/media.mp3" length="22210688" type="audio/mpeg"/>
			<guid isPermaLink="false">62e6682c7d3d0f001257b3f6</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://dl.acm.org/doi/10.1145/3514221.3526187</link>
			<acast:episodeId>62e6682c7d3d0f001257b3f6</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>tobias-ziegler-scalestore</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4Kq4lJLNgSL6slnmYovRANXCkSyiNs9qSOg1J5/ys5MNN1EDnxXXtObiyMzAeYFVdsMsOnD/lxPWV/DPlKODfwo]]></acast:settings>
			<itunes:subtitle>ACM SIGMOD/PODS 2022</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>1</itunes:season>
			<itunes:episode>9</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1659265243734-5cc0a0917397346cba75a10f9372bd8a.jpeg"/>
			<description><![CDATA[<h2>Summary: </h2><p>In this episode Tobias talks about his work on ScaleStore, a distributed storage engine that exploits DRAM caching, NVMe storage, and RDMA networking to achieve high performance, cost-efficiency, and scalability.&nbsp;</p><br><p>Using low latency RDMA messages, ScaleStore implements a transparent memory abstraction that provides access to the aggregated DRAM memory and NVMe storage of all nodes. In contrast to existing distributed RDMA designs such as NAM-DB or FaRM, ScaleStore stores cold data on NVMe SSDs (flash), lowering the overall hardware cost significantly.&nbsp;</p><br><p>At the heart of ScaleStore is a distributed caching strategy that dynamically decides which data to keep in memory (and which on SSDs) based on the workload. Tobias also talks about how the caching protocol provides strong consistency in the presence of concurrent data modifications. </p><p><br></p><h2>Questions: </h2><p>0:56: What is ScaleStore?&nbsp;</p><p>2:43: Can you elaborate on how ScaleStore solves the problems you just mentioned? And talk more about its caching protocol?</p><p>3:59: How does ScaleStore handle these concurrent updates, where two people want to update the same page?</p><p>5:16: Cool, so how does anticipatory chaining work and did you consider any other ways of dealing with concurrent updates to hot pages?</p><p>7:13: So over time pages get cached, the workload may change, and the DRAM buffers fill up. How does ScaleStore handle cache eviction?&nbsp;</p><p>8:57: As a user, how do I interact with ScaleStore?</p><p>10:19: How did you evaluate ScaleStore? What did you compare it against? What were the key results?&nbsp;</p><p>12:31: You said that ScaleStore is pretty unique in that there is no other system quite like it, but are there any situations in which it performs poorly or is maybe the wrong choice?</p><p>14:09: Where do you see this research having the biggest impact? 
Who will find ScaleStore useful? Who are the results most relevant for?&nbsp;</p><p>15:23: What are the most interesting or maybe unexpected lessons that you have learned while building ScaleStore?</p><p>16:55: Progress in research is sort of non-linear, so from the conception of the idea to the end, were there things you tried that failed? What were the dead ends you ran into that others could benefit from knowing about so they don’t make the same mistakes?&nbsp;&nbsp;</p><p>18:19: What do you have planned for future research?</p><p>20:01: What attracted you to this research area? What do you think is the biggest challenge in this area now?&nbsp;</p><p>20:21: If the network is no longer the bottleneck, what is the new bottleneck?</p><p>22:15: The last word now: what’s the one key thing you want listeners to take away from your research?</p><p><br></p><h2>Links: </h2><p><a href="https://dl.acm.org/doi/10.1145/3514221.3526187" rel="noopener noreferrer" target="_blank">SIGMOD Paper</a></p><p><a href="https://www.youtube.com/watch?v=-R_4kz8VemE" rel="noopener noreferrer" target="_blank">SIGMOD Presentation</a></p><p><a href="https://www.informatik.tu-darmstadt.de/systems/systems_tuda/group/team_detail_18944.en.jsp" rel="noopener noreferrer" target="_blank">Website</a></p><p><a href="mailto:tobias.ziegler@cs.tu-darmstadt.de" rel="noopener noreferrer" target="_blank">Email</a> &nbsp;</p><p><a href="https://twitter.com/tobiasziegler18" rel="noopener noreferrer" target="_blank">Twitter</a></p><p><a href="https://scholar.google.com/citations?user=qJ_bkjcAAAAJ&amp;hl=en&amp;oi=ao" rel="noopener noreferrer" target="_blank">Google Scholar</a></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h2>Summary: </h2><p>In this episode Tobias talks about his work on ScaleStore, a distributed storage engine that exploits DRAM caching, NVMe storage, and RDMA networking to achieve high performance, cost-efficiency, and scalability.&nbsp;</p><br><p>Using low latency RDMA messages, ScaleStore implements a transparent memory abstraction that provides access to the aggregated DRAM memory and NVMe storage of all nodes. In contrast to existing distributed RDMA designs such as NAM-DB or FaRM, ScaleStore stores cold data on NVMe SSDs (flash), lowering the overall hardware cost significantly.&nbsp;</p><br><p>At the heart of ScaleStore is a distributed caching strategy that dynamically decides which data to keep in memory (and which on SSDs) based on the workload. Tobias also talks about how the caching protocol provides strong consistency in the presence of concurrent data modifications. </p><p><br></p><h2>Questions: </h2><p>0:56: What is ScaleStore?&nbsp;</p><p>2:43: Can you elaborate on how ScaleStore solves the problems you just mentioned? And talk more about its caching protocol?</p><p>3:59: How does ScaleStore handle these concurrent updates, where two people want to update the same page?</p><p>5:16: Cool, so how does anticipatory chaining work and did you consider any other ways of dealing with concurrent updates to hot pages?</p><p>7:13: So over time pages get cached, the workload may change, and the DRAM buffers fill up. How does ScaleStore handle cache eviction?&nbsp;</p><p>8:57: As a user, how do I interact with ScaleStore?</p><p>10:19: How did you evaluate ScaleStore? What did you compare it against? What were the key results?&nbsp;</p><p>12:31: You said that ScaleStore is pretty unique in that there is no other system quite like it, but are there any situations in which it performs poorly or is maybe the wrong choice?</p><p>14:09: Where do you see this research having the biggest impact? 
Who will find ScaleStore useful? Who are the results most relevant for?&nbsp;</p><p>15:23: What are the most interesting or maybe unexpected lessons that you have learned while building ScaleStore?</p><p>16:55: Progress in research is sort of non-linear, so from the conception of the idea to the end, were there things you tried that failed? What were the dead ends you ran into that others could benefit from knowing about so they don’t make the same mistakes?&nbsp;&nbsp;</p><p>18:19: What do you have planned for future research?</p><p>20:01: What attracted you to this research area? What do you think is the biggest challenge in this area now?&nbsp;</p><p>20:21: If the network is no longer the bottleneck, what is the new bottleneck?</p><p>22:15: The last word now: what’s the one key thing you want listeners to take away from your research?</p><p><br></p><h2>Links: </h2><p><a href="https://dl.acm.org/doi/10.1145/3514221.3526187" rel="noopener noreferrer" target="_blank">SIGMOD Paper</a></p><p><a href="https://www.youtube.com/watch?v=-R_4kz8VemE" rel="noopener noreferrer" target="_blank">SIGMOD Presentation</a></p><p><a href="https://www.informatik.tu-darmstadt.de/systems/systems_tuda/group/team_detail_18944.en.jsp" rel="noopener noreferrer" target="_blank">Website</a></p><p><a href="mailto:tobias.ziegler@cs.tu-darmstadt.de" rel="noopener noreferrer" target="_blank">Email</a> &nbsp;</p><p><a href="https://twitter.com/tobiasziegler18" rel="noopener noreferrer" target="_blank">Twitter</a></p><p><a href="https://scholar.google.com/citations?user=qJ_bkjcAAAAJ&amp;hl=en&amp;oi=ao" rel="noopener noreferrer" target="_blank">Google Scholar</a></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Chuzhe Tang | Ad Hoc Transactions in Web Applications: The Good, the Bad, and the Ugly | #8</title>
			<itunes:title>Chuzhe Tang | Ad Hoc Transactions in Web Applications: The Good, the Bad, and the Ugly | #8</itunes:title>
			<pubDate>Mon, 25 Jul 2022 08:00:00 GMT</pubDate>
			<itunes:duration>32:15</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/62dd6f85cbc5b700128dbbd5/media.mp3" length="30969984" type="audio/mpeg"/>
			<guid isPermaLink="false">62dd6f85cbc5b700128dbbd5</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://dl.acm.org/doi/10.1145/3514221.3526120</link>
			<acast:episodeId>62dd6f85cbc5b700128dbbd5</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>ad-hoc-transactions</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4K8bhw5ShfnSh4zdi5qPbYNsYqu/oBZEeDHMC3agEJ//v2g6eklFeePHQMyDJzdw5vEQYzcFCaty6DSVe2SYOdP]]></acast:settings>
			<itunes:subtitle>ACM SIGMOD/PODS 2022</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>1</itunes:season>
			<itunes:episode>8</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1658679141382-87bc509a72b240155fa216f6069e1879.jpeg"/>
			<description><![CDATA[<h1>Summary: </h1><p>Many transactions in web applications are constructed ad-hoc in the application code. For example, developers might explicitly use locking primitives or validation procedures to coordinate critical code fragments. In this episode, Chuzhe tells us about these <strong>ad-hoc transactions</strong>: database operations coordinated by application code.</p><br><p>Until Chuzhe’s work, little was known about them. In this episode he chats about the first comprehensive study on ad hoc transactions. By studying 91 ad hoc transactions among 8 popular open-source web applications, he and his co-authors found that (i) every studied application uses ad hoc transactions (up to 16 per application), 71 of which play critical roles; (ii) compared with database transactions, concurrency control of ad hoc transactions is much more flexible; (iii) ad hoc transactions are error-prone: 53 of them have correctness issues, and 33 of them were confirmed by developers; and (iv) ad hoc transactions have the potential to improve performance in contentious workloads by utilizing application semantics such as access patterns.&nbsp;</p><br><p>During the interview he discusses the implications of ad hoc transactions for the database research community.</p><p><br></p><h1>Questions: </h1><p>0:58: What is concurrency control and why is it important for web applications?</p><p>3:00: How do applications today use concurrency control? Do they use classical database transactions? Or do they use other approaches?</p><p>4:09: How are these ad-hoc transactions used in practice? What was the primary focus of this paper?</p><p>5:13: You mentioned you studied various open-source applications to investigate ad-hoc transactions, which applications did you look at?</p><p>6:16: So what did you find when studying these different web applications? What do these ad-hoc transactions look like in the wild? 
Can you elaborate on how they differ?&nbsp;</p><p>8:59: When you compared ad-hoc transactions vs classic transactions, were you comparing potentially incorrect ad-hoc transactions with correct transactions? If so, do the performance gains just come from accepting that the result might be incorrect at some point?</p><p>10:25: We’ve spoken about how ad-hoc transactions were incorrect. Can we talk about the root cause of this: what were the common mistakes people were making with ad-hoc transactions?</p><p>12:16: What was the performance gain of ad-hoc transactions?</p><p>15:47: Are there other studies of transactions in the wild? If so, how do their findings compare to yours?</p><p>18:38: What does all this mean in practice? Why don’t people just use database transactions? What puts people off using them and thinking I’ll just roll my own?</p><p>21:10: Where do you see your findings having the biggest impact?</p><p>24:42: What do you have planned for future research?</p><p>26:46: What was the most interesting or perhaps unexpected lesson you learnt whilst working on ad-hoc transactions?</p><p>29:13: What attracted you to database concurrency control research?</p><p>30:53: What is the one key thing the listener should take away from your research?</p><p><br></p><h1>Links: </h1><p><a href="https://www.youtube.com/watch?v=WPBM3UCmZ38" rel="noopener noreferrer" target="_blank">Presentation</a></p><p><a href="https://dl.acm.org/doi/10.1145/3514221.3526120" rel="noopener noreferrer" target="_blank">Paper</a></p><p><a href="https://ipads.se.sjtu.edu.cn/pub/members/chuzhe_tang" rel="noopener noreferrer" target="_blank">Chuzhe's Website</a></p><p><a href="http://www.bailis.org/papers/feral-sigmod2015.pdf" rel="noopener noreferrer" target="_blank">Feral Concurrency Control</a></p><p><a href="https://youtu.be/M2MEcvMHzkY?t=3250" rel="noopener noreferrer" target="_blank">What are we doing with our lives? 
Nobody cares about our concurrency control research </a></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h1>Summary: </h1><p>Many transactions in web applications are constructed ad-hoc in the application code. For example, developers might explicitly use locking primitives or validation procedures to coordinate critical code fragments. In this episode, Chuzhe tells us about these <strong>ad-hoc transactions</strong>: database operations coordinated by application code.</p><br><p>Until Chuzhe’s work, little was known about them. In this episode he chats about the first comprehensive study on ad hoc transactions. By studying 91 ad hoc transactions among 8 popular open-source web applications, he and his co-authors found that (i) every studied application uses ad hoc transactions (up to 16 per application), 71 of which play critical roles; (ii) compared with database transactions, concurrency control of ad hoc transactions is much more flexible; (iii) ad hoc transactions are error-prone: 53 of them have correctness issues, and 33 of them were confirmed by developers; and (iv) ad hoc transactions have the potential to improve performance in contentious workloads by utilizing application semantics such as access patterns.&nbsp;</p><br><p>During the interview he discusses the implications of ad hoc transactions for the database research community.</p><p><br></p><h1>Questions: </h1><p>0:58: What is concurrency control and why is it important for web applications?</p><p>3:00: How do applications today use concurrency control? Do they use classical database transactions? Or do they use other approaches?</p><p>4:09: How are these ad-hoc transactions used in practice? What was the primary focus of this paper?</p><p>5:13: You mentioned you studied various open-source applications to investigate ad-hoc transactions, which applications did you look at?</p><p>6:16: So what did you find when studying these different web applications? What do these ad-hoc transactions look like in the wild? 
Can you elaborate on how they differ&nbsp;</p><p>8:59: When you compared ad-hoc transactions vs classic transactions? Are comparing potentially incorrect ad-hoc transactions vs correct transactions, if so are performance gains just not accepting it might be potentially incorrect at some point?</p><p>10:25: We’ve spoken about how ad-hoc transactions were incorrect. Can we talk about the root cause of this, what were the common mistakes people were making with ad-hoc transactions?</p><p>12:16: What was the performance gain of ad-hoc transactions?</p><p>15:47: Are there other studies of transactions in the wild? If so, how do their findings compare to yours?</p><p>18:38: What does all this mean in practice? Why don’t people just use database transactions? What puts people off using them and thinking I’ll just roll my own?</p><p>21:10: Where do you see your findings having the biggest impact?</p><p>24:42: What do you have planned for future research?</p><p>26:46: What was the most interesting or perhaps unexpected lesson you learnt whilst working on ad-hoc transactions?</p><p>29:13: What attracted you to database concurrency control research?</p><p>30:53: What is the one key thing the listener should take away from your research?</p><p><br></p><h1>Links: </h1><p><a href="https://www.youtube.com/watch?v=WPBM3UCmZ38" rel="noopener noreferrer" target="_blank">Presentation</a></p><p><a href="https://dl.acm.org/doi/10.1145/3514221.3526120" rel="noopener noreferrer" target="_blank">Paper</a></p><p><a href="https://ipads.se.sjtu.edu.cn/pub/members/chuzhe_tang" rel="noopener noreferrer" target="_blank">Chuzhe's Website</a></p><p><a href="http://www.bailis.org/papers/feral-sigmod2015.pdf" rel="noopener noreferrer" target="_blank">Feral Concurrency Control</a></p><p><a href="https://youtu.be/M2MEcvMHzkY?t=3250" rel="noopener noreferrer" target="_blank">What are we doing with our lives? 
Nobody cares about our concurrency control research </a></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Michael Abebe | Proteus: Autonomous Adaptive Storage for Mixed Workloads | #7 </title>
			<itunes:title>Michael Abebe | Proteus: Autonomous Adaptive Storage for Mixed Workloads | #7 </itunes:title>
			<pubDate>Mon, 18 Jul 2022 08:00:58 GMT</pubDate>
			<itunes:duration>27:57</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/62d43dc016d6e70013d2d527/media.mp3" length="26837120" type="audio/mpeg"/>
			<guid isPermaLink="false">62d43dc016d6e70013d2d527</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://cs.uwaterloo.ca/~mtabebe/publications/abebeProteus2022SIGMOD.pdf</link>
			<acast:episodeId>62d43dc016d6e70013d2d527</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>michael-abebe-proteus</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4IRv8vpPrKoMekFRU8a2YXjfaBTtpE56fNxg0P9rTQFfRs2LlVqDJaTlN+mOpFW2fcFkUuHliiY8PQL8E36JYNN]]></acast:settings>
			<itunes:subtitle>ACM SIGMOD/PODS 2022</itunes:subtitle>
			<itunes:episodeType>bonus</itunes:episodeType>
			<itunes:season>1</itunes:season>
			<itunes:episode>7</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1658076040424-8682775e889392b3651450faa5798ba4.jpeg"/>
			<description><![CDATA[<h1>Summary:</h1><p>Enterprises use distributed database systems to meet the demands of mixed or hybrid transaction/analytical processing (HTAP) workloads that contain both transactional (OLTP) and analytical (OLAP) requests. Distributed HTAP systems typically maintain a complete copy of data in row-oriented storage format that is well-suited for OLTP workloads and a second complete copy in column-oriented storage format optimised for OLAP workloads. Maintaining these data copies consumes significant storage space and system resources. Conversely, if a system stores data in a single format, OLTP or OLAP workload performance suffers.</p><br><p>In this interview, Michael talks about Proteus, a distributed HTAP database system that adaptively and autonomously selects and changes its storage layout to optimize for mixed workloads. Proteus generates physical execution plans that utilize storage-aware operators for efficient transaction execution. For HTAP workloads, Proteus delivers superior performance while providing OLTP and OLAP performance on par with designs specialized for either type of workload.</p><p><br></p><h1>Questions:</h1><p>0:56: Can you start off by explaining what a mixed workload is?&nbsp;</p><p>1:58: What is the challenge database systems face in trying to support these mixed workloads?&nbsp;</p><p>3:23: How have previous database systems tried to support mixed workloads?&nbsp;</p><p>5:19: What are the design goals of Proteus?&nbsp;</p><p>7:23: Can you elaborate more on the architecture of Proteus and how it makes decisions?&nbsp;</p><p>8:46: Can you dig into how you predict the transaction latency? What is the mechanism behind this?&nbsp;</p><p>10:35: It feels to me that you are accumulating a lot of metadata, which must have some overhead. How does this impact performance?&nbsp;</p><p>12:08: It sounds like the Adaptive Storage Advisor is a centralized coordinator. What are the limitations of this design 
choice?&nbsp;</p><p>13:35: Are we in the context of a data center here or can Proteus handle a geo-distributed deployment?&nbsp;</p><p>14:34: Changing the storage layout has some implicit cost. How does Proteus decide whether a storage layout change is good or bad?&nbsp;</p><p>16:57: How does Proteus predict what the transaction is going to be?</p><p>18:46: How did you evaluate Proteus?</p><p>20:20: If you had to summarize your work, what is the one key insight the listener can take away?</p><p>21:07: Is Proteus publicly available?&nbsp;</p><p>21:39: What are the next steps?&nbsp;</p><p>22:57: What is the most unexpected lesson you have learned whilst working on distributed database systems?&nbsp;</p><p>24:21: Do you think a single system catering for both workload types is better than two specialized engines?&nbsp;</p><p>26:10: What attracted you to work on this topic? </p><p><br></p><h1>Links:</h1><ul><li>Paper: <a href="https://cs.uwaterloo.ca/~mtabebe/publications/abebeProteus2022SIGMOD.pdf" rel="noopener noreferrer" target="_blank">https://cs.uwaterloo.ca/~mtabebe/publications/abebeProteus2022SIGMOD.pdf</a>&nbsp;</li><li>Presentation: <a href="https://www.youtube.com/watch?v=qbe29viYTas" rel="noopener noreferrer" target="_blank">https://www.youtube.com/watch?v=qbe29viYTas</a></li><li>Uni of Waterloo Data Systems Group: <a href="https://uwaterloo.ca/data-systems-group/" rel="noopener noreferrer" target="_blank">https://uwaterloo.ca/data-systems-group/</a>&nbsp;</li></ul><p><br></p><h1>Contact:</h1><ul><li>Website: <a href="https://cs.uwaterloo.ca/~mtabebe/" rel="noopener noreferrer" target="_blank">https://cs.uwaterloo.ca/~mtabebe/</a>&nbsp;</li><li>Email: <a href="mailto:mtabebe@uwaterloo.ca" rel="noopener noreferrer" target="_blank">mtabebe@uwaterloo.ca</a>&nbsp;</li><li>GitHub: @mtabebe</li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. 
See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h1>Summary:</h1><p>Enterprises use distributed database systems to meet the demands of mixed or hybrid transaction/analytical processing (HTAP) workloads that contain both transactional (OLTP) and analytical (OLAP) requests. Distributed HTAP systems typically maintain a complete copy of data in row-oriented storage format that is well-suited for OLTP workloads and a second complete copy in column-oriented storage format optimised for OLAP workloads. Maintaining these data copies consumes significant storage space and system resources. Conversely, if a system stores data in a single format, OLTP or OLAP workload performance suffers.</p><br><p>In this interview, Michael talks about Proteus, a distributed HTAP database system that adaptively and autonomously selects and changes its storage layout to optimize for mixed workloads. Proteus generates physical execution plans that utilize storage-aware operators for efficient transaction execution. For HTAP workloads, Proteus delivers superior performance while providing OLTP and OLAP performance on par with designs specialized for either type of workload.</p><p><br></p><h1>Questions:</h1><p>0:56: Can you start off by explaining what a mixed workload is?&nbsp;</p><p>1:58: What is the challenge database systems face in trying to support these mixed workloads?&nbsp;</p><p>3:23: How have previous database systems tried to support mixed workloads?&nbsp;</p><p>5:19: What are the design goals of Proteus?&nbsp;</p><p>7:23: Can you elaborate more on the architecture of Proteus and how it makes decisions?&nbsp;</p><p>8:46: Can you dig into how you predict the transaction latency? What is the mechanism behind this?&nbsp;</p><p>10:35: It feels to me that you are accumulating a lot of metadata, which must have some overhead. How does this impact performance?&nbsp;</p><p>12:08: It sounds like the Adaptive Storage Advisor is a centralized coordinator. What are the limitations of this design 
choice?&nbsp;</p><p>13:35: Are we in the context of a data center here or can Proteus handle a geo-distributed deployment?&nbsp;</p><p>14:34: Changing the storage layout has some implicit cost. How does Proteus decide whether a storage layout change is good or bad?&nbsp;</p><p>16:57: How does Proteus predict what the transaction is going to be?</p><p>18:46: How did you evaluate Proteus?</p><p>20:20: If you had to summarize your work, what is the one key insight the listener can take away?</p><p>21:07: Is Proteus publicly available?&nbsp;</p><p>21:39: What are the next steps?&nbsp;</p><p>22:57: What is the most unexpected lesson you have learned whilst working on distributed database systems?&nbsp;</p><p>24:21: Do you think a single system catering for both workload types is better than two specialized engines?&nbsp;</p><p>26:10: What attracted you to work on this topic? </p><p><br></p><h1>Links:</h1><ul><li>Paper: <a href="https://cs.uwaterloo.ca/~mtabebe/publications/abebeProteus2022SIGMOD.pdf" rel="noopener noreferrer" target="_blank">https://cs.uwaterloo.ca/~mtabebe/publications/abebeProteus2022SIGMOD.pdf</a>&nbsp;</li><li>Presentation: <a href="https://www.youtube.com/watch?v=qbe29viYTas" rel="noopener noreferrer" target="_blank">https://www.youtube.com/watch?v=qbe29viYTas</a></li><li>Uni of Waterloo Data Systems Group: <a href="https://uwaterloo.ca/data-systems-group/" rel="noopener noreferrer" target="_blank">https://uwaterloo.ca/data-systems-group/</a>&nbsp;</li></ul><p><br></p><h1>Contact:</h1><ul><li>Website: <a href="https://cs.uwaterloo.ca/~mtabebe/" rel="noopener noreferrer" target="_blank">https://cs.uwaterloo.ca/~mtabebe/</a>&nbsp;</li><li>Email: <a href="mailto:mtabebe@uwaterloo.ca" rel="noopener noreferrer" target="_blank">mtabebe@uwaterloo.ca</a>&nbsp;</li><li>GitHub: @mtabebe</li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. 
See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Hani Al-Sayeh | Juggler: Autonomous Cost Optimization and Performance Prediction of Big Data Applications | #6</title>
			<itunes:title>Hani Al-Sayeh | Juggler: Autonomous Cost Optimization and Performance Prediction of Big Data Applications | #6</itunes:title>
			<pubDate>Mon, 11 Jul 2022 08:00:46 GMT</pubDate>
			<itunes:duration>32:00</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/62ca1fd04fbc1e00118d5850/media.mp3" length="30730368" type="audio/mpeg"/>
			<guid isPermaLink="false">62ca1fd04fbc1e00118d5850</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/juggler</link>
			<acast:episodeId>62ca1fd04fbc1e00118d5850</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>juggler</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4JpsSonkcACvwWhNxSCMhECGaD7r3lY3RBZ69xyau8Tr1uY+byvdmVuBAILz2CWbbpXnc0DwRp4n4sPVGE+J+Zi]]></acast:settings>
			<itunes:subtitle>ACM SIGMOD/PODS 2022</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>1</itunes:season>
			<itunes:episode>6</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1657975040098-61f53b437424f4941c86cf4fdcd7c3ee.jpeg"/>
			<description><![CDATA[<h1>Summary:</h1><p>Distributed in-memory processing frameworks accelerate iterative workloads by caching suitable datasets in memory rather than recomputing them in each iteration. Selecting appropriate datasets to cache as well as allocating a suitable cluster configuration for caching these datasets play a crucial role in achieving optimal performance. In practice, both are tedious, time-consuming tasks and are often neglected by end users, who are typically not aware of workload semantics, sizes of intermediate data, and cluster specification. To address these problems, Hani and his colleagues developed Juggler, an end-to-end framework, which autonomously selects appropriate datasets for caching and recommends a correspondingly suitable cluster configuration to end users, with the aim of achieving optimal execution time and cost.</p><p><br></p><h1>Questions:</h1><p>1:02 - Can you introduce your work and describe the current workflow for developing big data applications in the cloud?</p><p>2:49 - What is the challenge (maybe hidden challenge) facing application developers in this workflow? What harms performance?</p><p>5:36 - How does Juggler solve this problem?</p><p>11:55 - As an end user, how do I interact with Juggler?</p><p>14:07 - Can you talk us through your evaluation of Juggler? What were the key insights?</p><p>16:30 - What other tools are similar to Juggler? How do they compare?</p><p>18:17 - What are the limitations of Juggler?</p><p>21:57 - Who will find Juggler the most useful? Who is it for?</p><p>24:05 - Is Juggler publicly available?</p><p>24:23 - What is the most interesting (maybe unexpected) lesson you learned while working on this topic?</p><p>27:50 - What is next for Juggler? 
What do you have planned for future research?</p><p>28:49 - What attracted you to this research area?&nbsp;</p><p>29:45 - What do you think is the biggest challenge now in this area?</p><p><br></p><h1>Links:</h1><ul><li><a href="https://dl.acm.org/doi/pdf/10.1145/3514221.3517892" rel="noopener noreferrer" target="_blank">Juggler: Autonomous Cost Optimization and Performance Prediction of Big Data Applications</a> (SIGMOD 2022 paper)</li><li><a href="https://www.youtube.com/watch?v=tYPm7fqHgh8" rel="noopener noreferrer" target="_blank">Juggler SIGMOD 22 presentation</a></li><li><a href="https://www.usenix.org/system/files/conference/nsdi17/nsdi17-alipourfard.pdf" rel="noopener noreferrer" target="_blank">CherryPick: Adaptively Unearthing the Best Cloud Configurations for Big Data Analytics</a> (NSDI 2017 paper)</li><li><a href="https://www.usenix.org/system/files/conference/nsdi16/nsdi16-paper-venkataraman.pdf" rel="noopener noreferrer" target="_blank">Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics</a> (NSDI 2016 paper)</li></ul><p><br></p><h1>Contact:</h1><ul><li>Email: hani-bassam.al-sayeh@tu-ilmenau.de</li><li><a href="https://www.linkedin.com/in/hani-al-sayeh-25057bb1/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li><li><a href="https://www.tu-ilmenau.de/en/university/departments/department-of-computer-science-and-automation/profile/institutes-and-groups/institute-of-applied-computer-science/databases-and-information-systems-group/team" rel="noopener noreferrer" target="_blank">TU Ilmenau Database and Information Systems Group</a> </li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h1>Summary:</h1><p>Distributed in-memory processing frameworks accelerate iterative workloads by caching suitable datasets in memory rather than recomputing them in each iteration. Selecting appropriate datasets to cache as well as allocating a suitable cluster configuration for caching these datasets play a crucial role in achieving optimal performance. In practice, both are tedious, time-consuming tasks and are often neglected by end users, who are typically not aware of workload semantics, sizes of intermediate data, and cluster specification. To address these problems, Hani and his colleagues developed Juggler, an end-to-end framework, which autonomously selects appropriate datasets for caching and recommends a correspondingly suitable cluster configuration to end users, with the aim of achieving optimal execution time and cost.</p><p><br></p><h1>Questions:</h1><p>1:02 - Can you introduce your work and describe the current workflow for developing big data applications in the cloud?</p><p>2:49 - What is the challenge (maybe hidden challenge) facing application developers in this workflow? What harms performance?</p><p>5:36 - How does Juggler solve this problem?</p><p>11:55 - As an end user, how do I interact with Juggler?</p><p>14:07 - Can you talk us through your evaluation of Juggler? What were the key insights?</p><p>16:30 - What other tools are similar to Juggler? How do they compare?</p><p>18:17 - What are the limitations of Juggler?</p><p>21:57 - Who will find Juggler the most useful? Who is it for?</p><p>24:05 - Is Juggler publicly available?</p><p>24:23 - What is the most interesting (maybe unexpected) lesson you learned while working on this topic?</p><p>27:50 - What is next for Juggler? 
What do you have planned for future research?</p><p>28:49 - What attracted you to this research area?&nbsp;</p><p>29:45 - What do you think is the biggest challenge now in this area?</p><p><br></p><h1>Links:</h1><ul><li><a href="https://dl.acm.org/doi/pdf/10.1145/3514221.3517892" rel="noopener noreferrer" target="_blank">Juggler: Autonomous Cost Optimization and Performance Prediction of Big Data Applications</a> (SIGMOD 2022 paper)</li><li><a href="https://www.youtube.com/watch?v=tYPm7fqHgh8" rel="noopener noreferrer" target="_blank">Juggler SIGMOD 22 presentation</a></li><li><a href="https://www.usenix.org/system/files/conference/nsdi17/nsdi17-alipourfard.pdf" rel="noopener noreferrer" target="_blank">CherryPick: Adaptively Unearthing the Best Cloud Configurations for Big Data Analytics</a> (NSDI 2017 paper)</li><li><a href="https://www.usenix.org/system/files/conference/nsdi16/nsdi16-paper-venkataraman.pdf" rel="noopener noreferrer" target="_blank">Ernest: Efficient Performance Prediction for Large-Scale Advanced Analytics</a> (NSDI 2016 paper)</li></ul><p><br></p><h1>Contact:</h1><ul><li>Email: hani-bassam.al-sayeh@tu-ilmenau.de</li><li><a href="https://www.linkedin.com/in/hani-al-sayeh-25057bb1/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li><li><a href="https://www.tu-ilmenau.de/en/university/departments/department-of-computer-science-and-automation/profile/institutes-and-groups/institute-of-applied-computer-science/databases-and-information-systems-group/team" rel="noopener noreferrer" target="_blank">TU Ilmenau Database and Information Systems Group</a> </li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Thomas Hütter | JEDI: These aren’t the JSON documents you’re looking for | #4</title>
			<itunes:title>Thomas Hütter | JEDI: These aren’t the JSON documents you’re looking for | #4</itunes:title>
			<pubDate>Fri, 08 Jul 2022 08:00:53 GMT</pubDate>
			<itunes:duration>11:50</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/62bd7b3c54dec20012fbd5ca/media.mp3" length="11362432" type="audio/mpeg"/>
			<guid isPermaLink="false">62bd7b3c54dec20012fbd5ca</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/jedi</link>
			<acast:episodeId>62bd7b3c54dec20012fbd5ca</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>jedi</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4I1+PCY6fq+gAVLaAbEL3Fvu/5Bni2PuIsiN0tKY4UTdGcwbqTlV1qXtrQ2o+2QoOjkzuGgxbogBLEwPfrAehTg]]></acast:settings>
			<itunes:subtitle>ACM SIGMOD/PODS 2022</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>1</itunes:season>
			<itunes:episode>4</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1657975090824-95440e1d1140a05670c9e631a208af3e.jpeg"/>
			<description><![CDATA[<h3>Summary:</h3><p><br></p><p>The JavaScript Object Notation (JSON) is a popular data format used in document stores to natively support semi-structured data.</p><p>In this interview, Thomas talks about how he addressed the problem of JSON similarity lookup queries: given a query document and a distance threshold, retrieve all documents that are within the threshold from the query document, i.e., get me all similar documents! Different from other hierarchical formats such as XML, JSON supports both ordered and unordered sibling collections within a single document, which poses a new challenge to the tree model and distance computation. Thomas talks about his proposal, the JSON tree, a lossless tree representation of JSON documents, and defines the JSON Edit Distance (JEDI), the first edit-based distance measure for JSON. He talks about the development of QuickJEDI, an algorithm that computes JEDI by leveraging a new technique to prune expensive sibling matchings. It outperforms a baseline algorithm by an order of magnitude in runtime. The experimental evaluation shows that the solution scales to databases with millions of documents and JSON trees with tens of thousands of nodes.</p><p><br></p><h3>Questions:</h3><h3><br></h3><p>0:47: Can you explain to the listeners what JSON is?</p><p>1:14: What is the problem you're trying to solve in your research? </p><p>1:48: What was the reason JSON was under-researched? </p><p>2:13: What is the motivation for this research? Why do we need it? </p><p>2:52: What was the solution you developed to solve this problem?</p><p>4:35: How does tree edit distance work?</p><p>5:18: How do we go from tree edit distance to JEDI? </p><p>6:29: How did you evaluate JEDI? </p><p>8:31: Do other database systems provide similar functionality? </p><p>9:33: Can you tell the listeners more about AsterixDB? 
</p><p>10:20: What was the most challenging aspect of working on this topic?</p><p>10:59: What are the future plans for this research? </p><p>11:56: What attracted you to working on similarity queries? </p><p><br></p><h3>Links:</h3><ul><li><a href="https://dl.acm.org/doi/pdf/10.1145/3514221.3517850" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://www.youtube.com/watch?v=aSUo_34pvIg&amp;feature=youtu.be" rel="noopener noreferrer" target="_blank">SIGMOD Presentation</a></li><li><a href="https://asterixdb.apache.org/" rel="noopener noreferrer" target="_blank">AsterixDB</a></li><li><a href="mailto:%20thomas.huetter@plus.ac.at" rel="noopener noreferrer" target="_blank">thomas.huetter@plus.ac.at</a></li><li><a href="http://thuetter.github.io/" rel="noopener noreferrer" target="_blank">Homepage</a></li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary:</h3><p><br></p><p>The JavaScript Object Notation (JSON) is a popular data format used in document stores to natively support semi-structured data.</p><p>In this interview, Thomas talks about how he addressed the problem of JSON similarity lookup queries: given a query document and a distance threshold, retrieve all documents that are within the threshold from the query document, i.e., get me all similar documents! Different from other hierarchical formats such as XML, JSON supports both ordered and unordered sibling collections within a single document, which poses a new challenge to the tree model and distance computation. Thomas talks about his proposal, the JSON tree, a lossless tree representation of JSON documents, and defines the JSON Edit Distance (JEDI), the first edit-based distance measure for JSON. He talks about the development of QuickJEDI, an algorithm that computes JEDI by leveraging a new technique to prune expensive sibling matchings. It outperforms a baseline algorithm by an order of magnitude in runtime. The experimental evaluation shows that the solution scales to databases with millions of documents and JSON trees with tens of thousands of nodes.</p><p><br></p><h3>Questions:</h3><h3><br></h3><p>0:47: Can you explain to the listeners what JSON is?</p><p>1:14: What is the problem you're trying to solve in your research? </p><p>1:48: What was the reason JSON was under-researched? </p><p>2:13: What is the motivation for this research? Why do we need it? </p><p>2:52: What was the solution you developed to solve this problem?</p><p>4:35: How does tree edit distance work?</p><p>5:18: How do we go from tree edit distance to JEDI? </p><p>6:29: How did you evaluate JEDI? </p><p>8:31: Do other database systems provide similar functionality? </p><p>9:33: Can you tell the listeners more about AsterixDB? 
</p><p>10:20: What was the most challenging aspect of working on this topic?</p><p>10:59: What are the future plans for this research? </p><p>11:56: What attracted you to working on similarity queries? </p><p><br></p><h3>Links:</h3><ul><li><a href="https://dl.acm.org/doi/pdf/10.1145/3514221.3517850" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://www.youtube.com/watch?v=aSUo_34pvIg&amp;feature=youtu.be" rel="noopener noreferrer" target="_blank">SIGMOD Presentation</a></li><li><a href="https://asterixdb.apache.org/" rel="noopener noreferrer" target="_blank">AsterixDB</a></li><li><a href="mailto:%20thomas.huetter@plus.ac.at" rel="noopener noreferrer" target="_blank">thomas.huetter@plus.ac.at</a></li><li><a href="http://thuetter.github.io/" rel="noopener noreferrer" target="_blank">Homepage</a></li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Sainyam Galhotra | Causal Feature Selection for Algorithmic Fairness | #5</title>
			<itunes:title>Sainyam Galhotra | Causal Feature Selection for Algorithmic Fairness | #5</itunes:title>
			<pubDate>Fri, 08 Jul 2022 08:00:41 GMT</pubDate>
			<itunes:duration>12:06</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/62bd88a354dec20012fc0fd0/media.mp3" length="11622528" type="audio/mpeg"/>
			<guid isPermaLink="false">62bd88a354dec20012fc0fd0</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/sainyam-galhotra-causal-feature-selection-for-algorithmic-fa</link>
			<acast:episodeId>62bd88a354dec20012fc0fd0</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>sainyam-galhotra-causal-feature-selection-for-algorithmic-fa</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4JAMTHlnucGDnvG+NfBX//jsUiheTYcw/3+gM2vNiM+o8PUogKg5HnVb2B2xWYp3SIIa2lZJpYl8jGLP6OYE64+]]></acast:settings>
			<itunes:subtitle>ACM SIGMOD/PODS 2022</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>1</itunes:season>
			<itunes:episode>5</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1657975525918-37a2c4dc76b3a53a9e2be1170e63d3fe.jpeg"/>
			<description><![CDATA[<h3>Summary:</h3><p><br></p><p>The use of machine learning (ML) in high-stakes societal decisions has encouraged the consideration of fairness throughout the ML lifecycle. Although data integration is one of the primary steps to generate high-quality training data, most of the fairness literature ignores this stage. In this interview, Sainyam discusses why he focuses on fairness in the integration component of data management, aiming to identify features that improve prediction without adding any bias to the dataset. Working under the causal fairness paradigm, and without requiring the underlying structural causal model a priori, he has developed an approach that identifies a sub-collection of features ensuring fairness of the dataset by performing conditional independence tests between different subsets of features. </p><p><br></p><h3>Questions:</h3><p>0:35: Can you introduce your work and describe the problem you're aiming to solve?</p><p>2:39: Can you elaborate on what fairness means?</p><p>3:51: Let's dig into your solution: how does the causal approach work?</p><p>4:41: How does your approach compare to other approaches in your evaluations?</p><p>6:17: How can data scientists apply your findings to the real world?</p><p>7:54: What was the most unexpected challenge you faced while working on algorithmic fairness?</p><p>8:29: What is next for your research? </p><p>9:17: Can you tell us about your other publications at SIGMOD? </p><p>10:57: How can the research community get involved in algorithmic fairness? 
</p><h3><br></h3><h3>Links:</h3><ul><li><a href="https://www.youtube.com/watch?v=sBdayt-W6JA" rel="noopener noreferrer" target="_blank">SIGMOD Presentation</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3514221.3517909" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://sainyamgalhotra.github.io/" rel="noopener noreferrer" target="_blank">Homepage</a></li><li><a href="https://twitter.com/SainyamGalhotra" rel="noopener noreferrer" target="_blank">Twitter</a></li><li><a href="https://www.linkedin.com/in/sainyam/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary:</h3><p><br></p><p>The use of machine learning (ML) in high-stakes societal decisions has encouraged the consideration of fairness throughout the ML lifecycle. Although data integration is one of the primary steps to generate high-quality training data, most of the fairness literature ignores this stage. In this interview, Sainyam discusses why he focuses on fairness in the integration component of data management, aiming to identify features that improve prediction without adding any bias to the dataset. Working under the causal fairness paradigm, and without requiring the underlying structural causal model a priori, he has developed an approach that identifies a sub-collection of features ensuring fairness of the dataset by performing conditional independence tests between different subsets of features. </p><p><br></p><h3>Questions:</h3><p>0:35: Can you introduce your work and describe the problem you're aiming to solve?</p><p>2:39: Can you elaborate on what fairness means?</p><p>3:51: Let's dig into your solution: how does the causal approach work?</p><p>4:41: How does your approach compare to other approaches in your evaluations?</p><p>6:17: How can data scientists apply your findings to the real world?</p><p>7:54: What was the most unexpected challenge you faced while working on algorithmic fairness?</p><p>8:29: What is next for your research? </p><p>9:17: Can you tell us about your other publications at SIGMOD? </p><p>10:57: How can the research community get involved in algorithmic fairness? 
</p><h3><br></h3><h3>Links:</h3><ul><li><a href="https://www.youtube.com/watch?v=sBdayt-W6JA" rel="noopener noreferrer" target="_blank">SIGMOD Presentation</a></li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3514221.3517909" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://sainyamgalhotra.github.io/" rel="noopener noreferrer" target="_blank">Homepage</a></li><li><a href="https://twitter.com/SainyamGalhotra" rel="noopener noreferrer" target="_blank">Twitter</a></li><li><a href="https://www.linkedin.com/in/sainyam/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Draco Xu | TSUBASA: Climate Network Construction on Historical and Real-Time Data | #3</title>
			<itunes:title>Draco Xu | TSUBASA: Climate Network Construction on Historical and Real-Time Data | #3</itunes:title>
			<pubDate>Mon, 04 Jul 2022 08:00:01 GMT</pubDate>
			<itunes:duration>17:14</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/62b5ec7055189d001520b2c4/media.mp3" length="16556160" type="audio/mpeg"/>
			<guid isPermaLink="false">62b5ec7055189d001520b2c4</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/tsubasa</link>
			<acast:episodeId>62b5ec7055189d001520b2c4</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>tsubasa</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4Ltn4q8Evgm9xVaA5ke1FfKpKeK3Ri4d7RghBN8C7fe+xGfMmfSuf+QMZVhuoJAbkVjrFGwq46YUZyZmokTXOdP]]></acast:settings>
			<itunes:subtitle>ACM SIGMOD/PODS 2022</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>1</itunes:season>
			<itunes:episode>3</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1657975554897-799fe6fe3d8ebf35d309f55ddbbf674b.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p><br></p><p>A climate network represents the global climate system by the interactions of a set of anomaly time-series. Network science has been applied to climate data to study the dynamics of a climate network. The core task, and first step, in enabling interactive network science on climate data is the efficient construction and update of a climate network on user-defined time windows. In this interview Draco talks about TSUBASA, an algorithm for the efficient construction of climate networks based on the exact calculation of Pearson’s correlation of large time-series. By pre-computing simple and low-overhead statistics, TSUBASA can efficiently compute the exact pairwise correlation of time-series on arbitrary time windows at query time. For real-time data, TSUBASA proposes a fast and incremental way of updating a network at interactive speed. TSUBASA is faster than approximate solutions by at least one order of magnitude for both historical and real-time data, and outperforms a baseline for time-series correlation calculation by up to two orders of magnitude.</p><p><br></p><h3>Questions: </h3><p><br></p><p>0:54 - Can you introduce your work, describe the problem your paper is aiming to solve and the motivation for doing so?</p><p>4:11 - What is the solution you developed? How did you tackle the problem?</p><p>6:50 - What is the improvement of TSUBASA over existing work?</p><p>8:59 - Are your tools/algorithms publicly available?</p><p>10:21 - What is the most interesting lesson or challenge faced whilst working on this topic?</p><p>11:51 - What are the future directions for your research?</p><p>15:43 - Are there other domains your research can be applied to?</p><p><br></p><h3>Links: </h3><ul><li><a href="https://yun-long-xu.com/" rel="noopener noreferrer" target="_blank">Homepage</a></li><li><a href="https://dl.acm.org/doi/abs/10.1145/3514221.3526177" rel="noopener noreferrer" target="_blank">Paper</a> (<a href="https://arxiv.org/pdf/2203.16457.pdf" rel="noopener noreferrer" target="_blank">arXiv</a>)</li><li><a href="https://github.com/DataIntelligenceCrew/tsupy" rel="noopener noreferrer" target="_blank">tsupy library</a></li></ul><p><br></p><h3>Contact Info: </h3><ul><li>Email: dracoxu@stanford.edu</li><li>Twitter: <a href="https://twitter.com/DracoyunlongXu" rel="noopener noreferrer" target="_blank">@DracoyunlongXu</a> </li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p><br></p><p>A climate network represents the global climate system by the interactions of a set of anomaly time-series. Network science has been applied to climate data to study the dynamics of a climate network. The core task, and first step, in enabling interactive network science on climate data is the efficient construction and update of a climate network on user-defined time windows. In this interview Draco talks about TSUBASA, an algorithm for the efficient construction of climate networks based on the exact calculation of Pearson’s correlation of large time-series. By pre-computing simple and low-overhead statistics, TSUBASA can efficiently compute the exact pairwise correlation of time-series on arbitrary time windows at query time. For real-time data, TSUBASA proposes a fast and incremental way of updating a network at interactive speed. TSUBASA is faster than approximate solutions by at least one order of magnitude for both historical and real-time data, and outperforms a baseline for time-series correlation calculation by up to two orders of magnitude.</p><p><br></p><h3>Questions: </h3><p><br></p><p>0:54 - Can you introduce your work, describe the problem your paper is aiming to solve and the motivation for doing so?</p><p>4:11 - What is the solution you developed? How did you tackle the problem?</p><p>6:50 - What is the improvement of TSUBASA over existing work?</p><p>8:59 - Are your tools/algorithms publicly available?</p><p>10:21 - What is the most interesting lesson or challenge faced whilst working on this topic?</p><p>11:51 - What are the future directions for your research?</p><p>15:43 - Are there other domains your research can be applied to?</p><p><br></p><h3>Links: </h3><ul><li><a href="https://yun-long-xu.com/" rel="noopener noreferrer" target="_blank">Homepage</a></li><li><a href="https://dl.acm.org/doi/abs/10.1145/3514221.3526177" rel="noopener noreferrer" target="_blank">Paper</a> (<a href="https://arxiv.org/pdf/2203.16457.pdf" rel="noopener noreferrer" target="_blank">arXiv</a>)</li><li><a href="https://github.com/DataIntelligenceCrew/tsupy" rel="noopener noreferrer" target="_blank">tsupy library</a></li></ul><p><br></p><h3>Contact Info: </h3><ul><li>Email: dracoxu@stanford.edu</li><li>Twitter: <a href="https://twitter.com/DracoyunlongXu" rel="noopener noreferrer" target="_blank">@DracoyunlongXu</a> </li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Felix S Campbell | Efficient Answering of Historical What-if Queries | #2</title>
			<itunes:title>Felix S Campbell | Efficient Answering of Historical What-if Queries | #2</itunes:title>
			<pubDate>Fri, 01 Jul 2022 08:00:25 GMT</pubDate>
			<itunes:duration>19:21</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/62bc6599cf1ab400126be7e3/media.mp3" length="18587776" type="audio/mpeg"/>
			<guid isPermaLink="false">62bc6599cf1ab400126be7e3</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/what-if-queries</link>
			<acast:episodeId>62bc6599cf1ab400126be7e3</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>what-if-queries</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4LrU7Kz0cxlZajazb6jQsh4lK8jwkGuiff0QYufFwECxFPlH8yXyLFkAeWzy/eAdavlybLXnn8mCSQxecQ/OuC6]]></acast:settings>
			<itunes:subtitle>ACM SIGMOD/PODS 2022</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>1</itunes:season>
			<itunes:episode>2</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1657975670072-aa65c287aa0eb58782ed1beb53575f5e.jpeg"/>
			<description><![CDATA[<h3>Summary:</h3><p><br></p><p>In this interview Felix discusses "historical what-if queries", a novel type of what-if analysis that determines the effect of a hypothetical change to the transactional history of a database. For example, “how would revenue be affected if we had charged an additional $6 for shipping?” In his research Felix has developed efficient techniques for answering these historical what-if queries, i.e., determining how a modified history affects the current database state. During the show, Felix talks about reenactment, a replay technique for transactional histories, and how he and his co-authors optimize this process using program and data slicing techniques to determine which updates and what data can be excluded from reenactment without affecting the result. </p><p><br></p><h3>Questions:</h3><p>0:42: Can you start off by explaining what historical what-if queries are?</p><p>1:56: What is the naive approach to answering these types of questions?</p><p>2:47: What are the problems with this naive approach and why is your solution better?</p><p>3:45: Tell us about reenactment, how does that work?</p><p>4:48: In your paper you mention two additional techniques, data slicing and program slicing, can you tell us more about these?  </p><p>6:44: How do reenactment, data slicing, and program slicing compare to other techniques in the literature? Where do they improve on the pitfalls of those? </p><p>8:00: Are there any commercial DBMSs that provide similar functionality out of the box?</p><p>8:57: How did you go about evaluating your solution? </p><p>10:40: What are the parameters you varied in your evaluation? </p><p>14:11: Where do you see this research being most useful? Who can use this?</p><p>15:17: Are the code/toolkit publicly available? </p><p>16:15: What is the most interesting aspect of working on what-if queries and more generally in the area of data provenance? 
</p><p>17:36: What do you have planned for future research? </p><p><br></p><h3>Links:</h3><ul><li><a href="https://www.fsalc.net/" rel="noopener noreferrer" target="_blank">Felix's homepage</a></li><li><a href="https://cs.iit.edu/~dbgroup/" rel="noopener noreferrer" target="_blank">Illinois Institute of Technology (IIT) Database Group's homepage</a> </li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3514221.3526138" rel="noopener noreferrer" target="_blank">Efficient Answering of Historical What-if Queries</a> SIGMOD paper</li><li><a href="https://www.youtube.com/watch?v=6O0InOM-ZbI" rel="noopener noreferrer" target="_blank">SIGMOD presentation</a></li></ul><p><br></p><h3>Contact Info:</h3><ul><li>Email: fcampbell@hawk.iit.edu</li><li><a href="https://www.linkedin.com/in/felix-campbell/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary:</h3><p><br></p><p>In this interview Felix discusses "historical what-if queries", a novel type of what-if analysis that determines the effect of a hypothetical change to the transactional history of a database. For example, “how would revenue be affected if we had charged an additional $6 for shipping?” In his research Felix has developed efficient techniques for answering these historical what-if queries, i.e., determining how a modified history affects the current database state. During the show, Felix talks about reenactment, a replay technique for transactional histories, and how he and his co-authors optimize this process using program and data slicing techniques to determine which updates and what data can be excluded from reenactment without affecting the result. </p><p><br></p><h3>Questions:</h3><p>0:42: Can you start off by explaining what historical what-if queries are?</p><p>1:56: What is the naive approach to answering these types of questions?</p><p>2:47: What are the problems with this naive approach and why is your solution better?</p><p>3:45: Tell us about reenactment, how does that work?</p><p>4:48: In your paper you mention two additional techniques, data slicing and program slicing, can you tell us more about these?  </p><p>6:44: How do reenactment, data slicing, and program slicing compare to other techniques in the literature? Where do they improve on the pitfalls of those? </p><p>8:00: Are there any commercial DBMSs that provide similar functionality out of the box?</p><p>8:57: How did you go about evaluating your solution? </p><p>10:40: What are the parameters you varied in your evaluation? </p><p>14:11: Where do you see this research being most useful? Who can use this?</p><p>15:17: Are the code/toolkit publicly available? </p><p>16:15: What is the most interesting aspect of working on what-if queries and more generally in the area of data provenance? 
</p><p>17:36: What do you have planned for future research? </p><p><br></p><h3>Links:</h3><ul><li><a href="https://www.fsalc.net/" rel="noopener noreferrer" target="_blank">Felix's homepage</a></li><li><a href="https://cs.iit.edu/~dbgroup/" rel="noopener noreferrer" target="_blank">Illinois Institute of Technology (IIT) Database Group's homepage</a> </li><li><a href="https://dl.acm.org/doi/pdf/10.1145/3514221.3526138" rel="noopener noreferrer" target="_blank">Efficient Answering of Historical What-if Queries</a> SIGMOD paper</li><li><a href="https://www.youtube.com/watch?v=6O0InOM-ZbI" rel="noopener noreferrer" target="_blank">SIGMOD presentation</a></li></ul><p><br></p><h3>Contact Info:</h3><ul><li>Email: fcampbell@hawk.iit.edu</li><li><a href="https://www.linkedin.com/in/felix-campbell/" rel="noopener noreferrer" target="_blank">LinkedIn</a></li></ul><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Alex Isenko | Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines | #1</title>
			<itunes:title>Alex Isenko | Where Is My Training Bottleneck? Hidden Trade-Offs in Deep Learning Preprocessing Pipelines | #1</itunes:title>
			<pubDate>Mon, 27 Jun 2022 08:00:24 GMT</pubDate>
			<itunes:duration>24:32</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/62b8a376bbf6a40012ea327b/media.mp3" length="23564416" type="audio/mpeg"/>
			<guid isPermaLink="false">62b8a376bbf6a40012ea327b</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/where-is-my-training-bottleneck</link>
			<acast:episodeId>62b8a376bbf6a40012ea327b</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>where-is-my-training-bottleneck</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4JjTxNnkwoIK6hT2HgtIPM0+KtjN9eKFkM2K9140k24+2Bo9w3pelGfLqYZT+AtIz5OZxAyVnd2q6Nz3PFr1LFM]]></acast:settings>
			<itunes:subtitle>ACM SIGMOD/PODS 2022</itunes:subtitle>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>1</itunes:season>
			<itunes:episode>1</itunes:episode>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1657975616211-9ae6856367c6fa89aed953d15fffde4c.jpeg"/>
			<description><![CDATA[<h3>Summary: </h3><p><br></p><p>Preprocessing pipelines in deep learning aim to provide sufficient data throughput to keep the training processes busy. Maximizing resource utilization is becoming more challenging as the throughput of training processes increases with hardware innovations (e.g., faster GPUs, TPUs, and inter-connects) and advanced parallelization techniques that yield better scalability. At the same time, the amount of training data needed to train increasingly complex models is growing. As a consequence of this development, data preprocessing and provisioning are becoming a severe bottleneck in end-to-end deep learning pipelines.</p><br><p>In this interview Alex talks about his in-depth analysis of data preprocessing pipelines from four different machine learning domains. Additionally, he discusses a new perspective on efficiently preparing datasets for end-to-end deep learning pipelines and extracts individual trade-offs to optimize throughput, preprocessing time, and storage consumption. Alex and his collaborators have developed an open-source profiling library that can automatically decide on a suitable preprocessing strategy to maximize throughput. By applying their generated insights to real-world use cases, an increased throughput of 3x to 13x can be obtained compared to an untuned system while keeping the pipeline functionally identical. 
These findings show the enormous potential of data pipeline tuning.</p><p><br></p><h3>Questions: </h3><p>0:36 - Can you explain to our listeners what a deep learning pipeline is?</p><p>1:33 - In this pipeline, how does data pre-processing become a bottleneck?</p><p>5:40 - In the paper you analyse several different domains, can you go into more detail about the domains and pipelines?</p><p>6:49 - What are the key insights from your analysis?</p><p>8:28 - What are the other insights?</p><p>13:23 - Your paper introduces PRESTO, the open-source profiling library, can you tell us more about that?</p><p>15:56 - How does this compare to other tools in the space?</p><p>18:46 - Who will find PRESTO useful?</p><p>20:13 - What is the most interesting, unexpected, or challenging lesson you encountered whilst working on this topic?</p><p>22:10 - What do you have planned for future research?</p><p><br></p><h3>Links: </h3><ul><li><a href="https://www.cs.cit.tum.de/msrg/team/alexander-isenko/" rel="noopener noreferrer" target="_blank">Homepage</a></li><li><a href="https://arxiv.org/pdf/2202.08679.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://github.com/cirquit/presto" rel="noopener noreferrer" target="_blank">PRESTO</a></li></ul><p><br></p><h3>Contact Info: </h3><ul><li>Email: alex.isenko@tum.de</li><li><a href="https://www.linkedin.com/in/alexanderisenko/?originalSubdomain=de" rel="noopener noreferrer" target="_blank">LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<h3>Summary: </h3><p><br></p><p>Preprocessing pipelines in deep learning aim to provide sufficient data throughput to keep the training processes busy. Maximizing resource utilization is becoming more challenging as the throughput of training processes increases with hardware innovations (e.g., faster GPUs, TPUs, and inter-connects) and advanced parallelization techniques that yield better scalability. At the same time, the amount of training data needed to train increasingly complex models is growing. As a consequence of this development, data preprocessing and provisioning are becoming a severe bottleneck in end-to-end deep learning pipelines.</p><br><p>In this interview Alex talks about his in-depth analysis of data preprocessing pipelines from four different machine learning domains. Additionally, he discusses a new perspective on efficiently preparing datasets for end-to-end deep learning pipelines and extracts individual trade-offs to optimize throughput, preprocessing time, and storage consumption. Alex and his collaborators have developed an open-source profiling library that can automatically decide on a suitable preprocessing strategy to maximize throughput. By applying their generated insights to real-world use cases, an increased throughput of 3x to 13x can be obtained compared to an untuned system while keeping the pipeline functionally identical. 
These findings show the enormous potential of data pipeline tuning.</p><p><br></p><h3>Questions: </h3><p>0:36 - Can you explain to our listeners what a deep learning pipeline is?</p><p>1:33 - In this pipeline, how does data pre-processing become a bottleneck?</p><p>5:40 - In the paper you analyse several different domains, can you go into more detail about the domains and pipelines?</p><p>6:49 - What are the key insights from your analysis?</p><p>8:28 - What are the other insights?</p><p>13:23 - Your paper introduces PRESTO, the open-source profiling library, can you tell us more about that?</p><p>15:56 - How does this compare to other tools in the space?</p><p>18:46 - Who will find PRESTO useful?</p><p>20:13 - What is the most interesting, unexpected, or challenging lesson you encountered whilst working on this topic?</p><p>22:10 - What do you have planned for future research?</p><p><br></p><h3>Links: </h3><ul><li><a href="https://www.cs.cit.tum.de/msrg/team/alexander-isenko/" rel="noopener noreferrer" target="_blank">Homepage</a></li><li><a href="https://arxiv.org/pdf/2202.08679.pdf" rel="noopener noreferrer" target="_blank">Paper</a></li><li><a href="https://github.com/cirquit/presto" rel="noopener noreferrer" target="_blank">PRESTO</a></li></ul><p><br></p><h3>Contact Info: </h3><ul><li>Email: alex.isenko@tum.de</li><li><a href="https://www.linkedin.com/in/alexanderisenko/?originalSubdomain=de" rel="noopener noreferrer" target="_blank">LinkedIn</a></li></ul><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
		<item>
			<title>Coming Soon | ACM SIGMOD/PODS 2022 | #0</title>
			<itunes:title>Coming Soon | ACM SIGMOD/PODS 2022 | #0</itunes:title>
			<pubDate>Fri, 03 Jun 2022 19:57:25 GMT</pubDate>
			<itunes:duration>1:30</itunes:duration>
			<enclosure url="https://sphinx.acast.com/p/open/s/629a6154b4e1e70012764c00/e/629a67a55da2db0012b464ba/media.mp3" length="26058344" type="audio/mpeg"/>
			<guid isPermaLink="false">629a67a55da2db0012b464ba</guid>
			<itunes:explicit>false</itunes:explicit>
			<link>https://shows.acast.com/disseminate/episodes/test</link>
			<acast:episodeId>629a67a55da2db0012b464ba</acast:episodeId>
			<acast:showId>629a6154b4e1e70012764c00</acast:showId>
			<acast:episodeUrl>test</acast:episodeUrl>
			<acast:settings><![CDATA[FYjHyZbXWHZ7gmX8Pp1rmbKbhgrQiwYShz70Q9/ffXZMTtedvdcRQbP4eiLMjXzCKLPjEYLpGj+NMVKa+5C8pL4u/EOj1Vw4h5MMJYp0lCcFAe0fnxBJy/1ju4Qxy1fh8gO4DvlGA40yms2g0/hOkcrfHIopjTygHFqGwwOPKFIai4SuTvs86Lx3UYCyl6ZsNG3kc38GrmfZwINfPZXEKgypMDEHaetQPjZ5uUBCs4KUaB/C/tL4OilDFl5g0CYlA28Nf6Cji3JwI1ogPuTsyfV2X27HidyK/EBkC7JEt+r0nwGP+GHFCDyJvT6KmN+1]]></acast:settings>
			<itunes:episodeType>full</itunes:episodeType>
			<itunes:season>1</itunes:season>
			<itunes:image href="https://assets.pippa.io/shows/629a6154b4e1e70012764c00/1657975702915-6ff3a43592802bdf56bf43db3b35ff42.jpeg"/>
			<description><![CDATA[<p>Welcome to Disseminate! The podcast bringing you the cutting edge of Computer Science research in a digestible format. Each series will focus on papers published at a specific Computer Science conference, e.g., SIGMOD, CVPR, so we will cover a wide range of topics from distributed systems to computer vision. Each episode within a series will feature an interview with the author(s) of a paper published at that conference. The podcast aims to be an alternative source of information for industry practitioners, researchers, and students. It will be of particular use to practitioners, as there will be a focus on the practical relevance of research, in an attempt to help bridge the gap between industry and academia. Also, as many interesting ideas and breakthroughs come from the cross-pollination of different disciplines within Computer Science, researchers should find the podcast useful too, in addition to it being a way of keeping up with work in their own research area. For students, we hope Disseminate will be a useful learning tool.</p><br><p>The first season will focus on the 2022 ACM SIGMOD/PODS International Conference on Management of Data, which is taking place in Philadelphia from Sunday, 12 June to Friday, 17 June. Episodes will start being released in the weeks following the conference. </p><br><p>We look forward to you joining us on this journey! </p><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></description>
			<itunes:summary><![CDATA[<p>Welcome to Disseminate! The podcast bringing you the cutting edge of Computer Science research in a digestible format. Each series will focus on papers published at a specific Computer Science conference, e.g., SIGMOD, CVPR, so we will cover a wide range of topics from distributed systems to computer vision. Each episode within a series will feature an interview with the author(s) of a paper published at that conference. The podcast aims to be an alternative source of information for industry practitioners, researchers, and students. It will be of particular use to practitioners, as there will be a focus on the practical relevance of research, in an attempt to help bridge the gap between industry and academia. Also, as many interesting ideas and breakthroughs come from the cross-pollination of different disciplines within Computer Science, researchers should find the podcast useful too, in addition to it being a way of keeping up with work in their own research area. For students, we hope Disseminate will be a useful learning tool.</p><br><p>The first season will focus on the 2022 ACM SIGMOD/PODS International Conference on Management of Data, which is taking place in Philadelphia from Sunday, 12 June to Friday, 17 June. Episodes will start being released in the weeks following the conference. </p><br><p>We look forward to you joining us on this journey! </p><p><br></p><hr><p style='color:grey; font-size:0.75em;'> Hosted on Acast. See <a style='color:grey;' target='_blank' rel='noopener noreferrer' href='https://acast.com/privacy'>acast.com/privacy</a> for more information.</p>]]></itunes:summary>
		</item>
    	<itunes:category text="Education"/>
    	<itunes:category text="Technology"/>
		<itunes:category text="News">
			<itunes:category text="Tech News"/>
		</itunes:category>
    </channel>
</rss>
