Real-Time Decision Engine using Spark Structured Streaming + ML

Real-time decision making using ML/AI is the holy grail of customer-facing applications. It’s no longer a long-shot dream; it’s our new reality. The real-time decision engine leverages the latest features in Apache Spark 2.3, including stream-to-stream joins and Spark ML, to directly improve the customer experience. We will discuss the architecture at length, including data… Continue reading


Snowflake: Getting Started with Walkthrough

What is Snowflake? Snowflake is a new era relational SQL data warehouse built for the cloud that seeks to enable seamless and fully elastic access to business-critical data that satisfies everyone from Analysts to IT to Finance. But why – aren’t there enough Data Warehouses already?! tl;dr Quantity != Quality. Snowflake offers decoupled elastic compute… Continue reading


Runtime Stats for Functions | Python Decorator

In a similar vein to my prior Python decorator metadata for functions (“meta_func” => github | PyPi | blog), this decorator is intended to help illuminate the number of calls and time taken per call aggregates. It will keep track of each function by its uniquely assigned python object identifier, the total number of function… Continue reading