Make Your RStudio on Databricks More Durable and Resilient
One of the questions that we often hear from our customers these days is, “Should I develop my solution in Python or R?” There is no right or wrong answer to this question, as it largely depends on the...
Solution Accelerator: Multi-touch Attribution
Behind the growth of every consumer-facing product is the acquisition and retention of an engaged user base. When it comes to customer acquisition, the goal is to attract high-quality users as cost...
Improving On-Shelf Availability for Items with AI Out of Stock Modeling
This post was written in collaboration with Databricks partner Tredence. We thank Rich Williams, Vice President Data Engineering, and Morgan Seybert, Chief Business Officer, of Tredence for their...
How to Manage End-to-end Deep Learning Pipelines with Databricks
Deep Learning (DL) models are being applied to use cases across all industries — fraud detection in financial services, personalization in media, image recognition in healthcare and more. With this...
Announcing Databricks Autologging for Automated ML Experiment Tracking
Machine learning teams require the ability to reproduce and explain their results, whether for regulatory, debugging, or other purposes. This means every production model must have a record of its...
Announcing Databricks Serverless SQL
Databricks SQL already provides a first-class user experience for BI and SQL directly on the data lake, and today, we are excited to announce another step in making data and AI simple with Databricks...
Frequently Asked Questions About the Data Lakehouse
Question Index What is a Data Lakehouse? How is a Data Lakehouse different from a Data Warehouse? How is the Lakehouse different from a Data Lake? How easy is it for data analysts to use a Data...
How Incremental ETL Makes Life Simpler With Data Lakes
Incremental ETL (Extract, Transform and Load) in a conventional data warehouse has become commonplace with CDC (change data capture) sources, but scale, cost, accounting for state and the lack of...
Infrastructure Design for Real-time Machine Learning Inference
This is a guest authored post by Yu Chen, Senior Software Engineer, Headspace. Headspace’s core products are iOS, Android and web-based apps that focus on improving the health and happiness of its...
Announcing the Launch of Delta Live Tables on Google Cloud
Today, we are excited to announce the availability of Delta Live Tables (DLT) on Google Cloud. With this launch, enterprises can now use DLT to easily build and deploy SQL and Python pipelines and run...
Implementing More Effective FAIR Scientific Data Management With a Lakehouse
Data powers scientific discovery and innovation. But data is only as good as its data management strategy, the key factor in ensuring data quality, accessibility, and reproducibility of results – all...
New Performance Improvements in Databricks SQL
Originally announced at Data + AI Summit 2020 Europe, Databricks SQL lets you operate a multi-cloud lakehouse architecture that provides data warehousing performance at data lake economics. Our vision...
Introducing the Databricks Community: Online Discussions for Data + AI...
At Databricks, we know that impactful data transformation in any organization starts with empowered individuals. This is why we are excited to launch the Databricks Community to serve as an engaging...
5 Steps to Implementing Intelligent Data Pipelines With Delta Live Tables
Many IT organizations are familiar with the traditional extract, transform and load (ETL) process – as a series of steps defined to move and transform data from source to traditional data warehouses...
Announcing Public Preview of Low Shuffle Merge
Today, we are excited to announce the public preview of Low Shuffle Merge in Delta Lake, available on AWS, Azure, and Google Cloud. This new and improved MERGE algorithm is substantially faster and...
Real-time Point-of-Sale Analytics With a Data Lakehouse
Disruptions in the supply chain – from reduced product supply to diminished warehouse capacity – coupled with rapidly shifting consumer expectations for seamless omnichannel experiences are driving...
4 Ways AI Can Future-proof Financial Services’ Risk and Compliance
The core function of a bank is to protect assets, identify risks and mitigate losses by protecting customers from fraud, money laundering and other financial crimes. In today’s interconnected and...
Large Scale ETL and Lakehouse Implementation at Asurion
This is a guest post from Tomasz Magdanski, Director of Engineering, Asurion. With its insurance and installation, repair, replacement and 24/7 support services, Asurion helps people protect,...
Timeliness and Reliability in the Transmission of Regulatory Reports
Managing risk and regulatory compliance is an increasingly complex and costly endeavour. Regulatory change has increased 500% since the 2008 global financial crisis and has boosted regulatory costs in...
Part 1: Implementing CI/CD on Databricks Using Databricks Notebooks and Azure...
The code discussed can be found here. This is the first part of a two-part series of blog posts that show how to configure and build end-to-end MLOps solutions on Databricks with notebooks and the Repos API....