How to Simplify CDC With Delta Lake’s Change Data Feed
Change data capture (CDC) is a use case that we see many customers implement in Databricks – you can check out our previous deep dive on the topic here. Typically we...
How to Build a Scalable Wide and Deep Product Recommender
Download the notebooks referenced throughout this article. I have a favorite coffee shop I’ve been visiting for years. When I walk in, the barista knows me by name and asks if I’d like my usual drink....
Solution Accelerator: Toxicity Detection in Gaming
Across massively multiplayer online video games (MMOs), multiplayer online battle arena games (MOBAs) and other forms of online gaming, players continuously interact in real time to either coordinate...
Announcing Photon Public Preview: The Next Generation Query Engine on the...
Today, we’re excited to announce the availability of Photon in public preview. Photon is a native vectorized engine developed in C++ to dramatically improve query performance. All you have to do to...
The Modern Chief Data Officer: Transitioning From Defense to Offense
The Chief Data Officer (CDO) is not a new position – Capital One reportedly had a CDO all the way back in 2002. But only recently has it become a mainstream, business-critical role for enterprises. In...
How Databricks Supports Digital Native Companies in Their Hyper-growth Journey
In a recent panel discussion, Richard Zananiri, Director of EMEA Mid-market at Databricks, was joined by four globally operating, high-growth cloud-native companies. Each organization is at the...
Get Your Free Copy of Delta Lake: The Definitive Guide (Early Release)
At the Data + AI Summit, we were thrilled to announce the early release of Delta Lake: The Definitive Guide, published by O’Reilly. The guide teaches how to build a modern lakehouse architecture that...
Need for Data-centric ML Platforms
This blog is the first in a series on MLOps and Model Governance. The next blog will be by Joseph Bradley and will discuss how to choose the right technologies for data science and machine learning...
Three Principles for Selecting Machine Learning Platforms
This blog post is the second in a series on ML platforms, operations, and governance. For the first post, see Rafi Kurlansik’s post on the “Need for Data-centric ML Platforms.” I recently spoke with...
Using Bayesian Hierarchical Models to Infer the Disease Parameters of COVID-19
In a previous post, we looked at how to use PyMC3 to model the disease dynamics of COVID-19. This post builds on this use case and explores how to use Bayesian hierarchical models to infer COVID-19...
Databricks Solutions Showcase
Inspiration doesn’t always come from our peers. Some of the best ideas come from innovators in other industries. This is especially true when it comes to data science and AI/ML innovations being applied...
Applying Natural Language Processing to Healthcare Text at Scale
This is a co-authored post written in collaboration with John Snow Labs. We thank Moritz Steller, senior cloud solution architect, at John Snow Labs for his contributions. In 2015, HIMSS estimated...
A Shared Vision for Data Teams: Why Cubonacci Joined Databricks
Today, we are excited to announce that our company, Cubonacci, has joined the Databricks family. We founded Cubonacci in Amsterdam to enable businesses to build scalable and future-proof data science...
Democratizing Data and AI in Finserv: DAIS 2021 Takeaways
For financial services providers, driving business forward with data is a longstanding practice—but as machine learning (ML) and artificial intelligence (AI) technologies improve, understanding the...
Four E-commerce Challenges That Can Be Addressed With Data + AI
The global health crisis accelerated the adoption of omnichannel shopping and fulfillment. Consumers spent $861.12 billion online with US merchants in 2020, up an incredible 44% compared to the...
Down to the Individual Grain: How John Deere Uses Industrial AI to Increase...
Recently, The Verge spoke with Jahmy Hindman, CTO at John Deere, about the transformation of the company’s farm equipment over the last three decades from purely mechanical to, as Jahmy calls them,...
Now in Databricks: Orchestrate Multiple Tasks With Databricks Jobs
As companies undertake more business intelligence (BI) and artificial intelligence (AI) initiatives, the need for simple, clear and reliable orchestration of data processing tasks...
Using Your Data to Stop Credit Card Fraud: Capital One and Other Best Practices
Fraud is a costly and growing problem – research estimates that $1 of fraud costs companies 3.36x in chargeback, replacement and operational cost. Adding to the pain, according to experts, there are...
Driving Transformation at Northwestern Mutual (Insights Platform) by Moving...
This is a guest authored post by Manhu Kotian, Vice President of Engineering (Investment Products Data, CRM, Apps and Reporting) at Northwestern Mutual. Digital Transformation has been front and...
Feature Engineering at Scale
Feature engineering is one of the most important and time-consuming steps of the machine learning process. Data scientists and analysts often find themselves spending a lot of time experimenting with...