Batch Processing from the Ground Up: Part III
In the previous article, we dived into MapReduce. MapReduce took the data processing world by storm for providing fast and distributed…
Oct 26, 2020

Batch Processing from the Ground Up: Part II
We last left off on Unix pipelines and how its philosophy can help us scale up batch processing on a distributed network. We then…
Oct 24, 2020

Batch Processing from the Ground Up: Part I
Batch Processing Systems: A system that takes a large amount of input data and runs a job to process it and produces some output data…
Oct 9, 2020

Databases from the ground up Part III
We have shown how to think about data retrieval systems. We have covered LSM-Trees, B-Trees, how to think of segment files, and how memory…
Oct 5, 2020

Databases from the ground up Part II
We last left off at hash indexes. We explored how hash indexes sped up the retrieval of data by keeping the offset of the key within a…
Oct 1, 2020

Databases from the ground up Part I
In the growing world of data lingo, you might have heard Online Analytical Processing (OLAP), Online Transaction Processing (OLTP), and Data…
Sep 27, 2020

A dive into Data Models; how should you structure your data?
The way we structure data inherently affects how we think and reason about the problem. For example, in a declarative language like SQL…
Sep 24, 2020

Reliability, Scalability, and Maintainability is all a Data System Needs
Data has become the forefront of powering many applications. From complex machine learning algorithms to social media apps, to government…
Sep 23, 2020