Citus Blog

Articles tagged: AzureDBPostgres

UK COVID-19 dashboard built using Postgres and Citus for millions of users

Written by Claire Giordano & Pouria Hadjibagheri | December 11, 2021

From the beginning of the COVID-19 pandemic, the United Kingdom (UK) government has made it a top priority to track key health metrics and to share those metrics with the public.

And the citizens of the UK were hungry for information, as they tried to make sense of what was happening. Maps, graphs, and tables became the lingua franca of the pandemic. As a result, the GOV.UK Coronavirus dashboard became one of the most visited public service websites in the United Kingdom.

The list of people who rely on the UK Coronavirus dashboard is quite long: government personnel, public health officials, healthcare employees, journalists, and the public all use the same service.

Keep reading

There is some good news for those of you wanting to shard your Postgres database in the cloud, so that as your data grows you have an easy way to scale out your Postgres database. I’m delighted to announce that Citus 10—the latest open source release of the Citus extension to Postgres—is now generally available in Hyperscale (Citus).

Hyperscale (Citus) is a built-in option in the Azure Database for PostgreSQL managed service, which has been around for a couple of years to help those of you who would rather focus on your application—and not on spending cycles managing your database.

Keep reading

It’s been an eventful time for Hyperscale (Citus) lately. If you’re interested in Postgres, distributed databases, and how to handle the ever-growing needs of your Postgres application, or if you simply use Hyperscale (Citus), keep reading.

Citus is an open source extension to Postgres that enables horizontal scaling of your Postgres database. Citus distributes your Postgres tables, writes, and SQL queries across multiple nodes—parallelizing your workload and enabling you to use the memory, compute, and disk of a multi-node cluster. And Citus is available on Azure: Hyperscale (Citus) is a deployment option in Azure Database for PostgreSQL.
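To make "distributes your Postgres tables across multiple nodes" a bit more concrete, here is a toy Python sketch that assigns rows to shards by hashing a distribution column value. It is only an illustration, not how Citus is implemented: the hash function (SHA-256), the shard count of 4, the `tenant_id` column, and the helper names `shard_for` and `group_by_shard` are all assumptions made for this example.

```python
import hashlib
from collections import defaultdict

SHARD_COUNT = 4  # assumed value for this example only


def shard_for(value, shard_count=SHARD_COUNT):
    """Map a distribution column value to a shard index (toy hash, not Citus's)."""
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % shard_count


def group_by_shard(rows, distribution_column, shard_count=SHARD_COUNT):
    """Group rows (dicts) by the shard their distribution column value hashes to."""
    shards = defaultdict(list)
    for row in rows:
        shards[shard_for(row[distribution_column], shard_count)].append(row)
    return dict(shards)


# Hypothetical sample data: 8 tenants, 2 events each.
rows = [{"tenant_id": t, "event": e} for t in range(8) for e in ("view", "click")]
placement = group_by_shard(rows, "tenant_id")
```

The property worth noticing is that every row with the same `tenant_id` lands in the same shard, so a query filtered on that column would only need to touch one shard, while a query over all tenants could be spread across all of them.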

What’s really exciting to me is that we’ve made it easier and cheaper than ever to try and use Hyperscale (Citus). With Basic tier, you can now use Hyperscale (Citus) on a single node, parallelizing your operations and adopting a distributed database model from the very beginning. And you can now try Citus open source with a single docker run command—boom!

Keep reading

Citus is an extension to Postgres that lets you distribute your application’s workload across multiple nodes. Whether you are using Citus open source or using Citus as part of a managed Postgres service in the cloud, one of the first things you do when you start using Citus is to distribute your tables. While distributing your Postgres tables you need to decide on some properties, such as the distribution column, shard count, and colocation. And even before you decide on your distribution column (sometimes called a distribution key, or a sharding key), when you create a Postgres table, your table is created with an access method.

Previously you had to decide on these table properties up front, and then you went with your decision. Or if you really wanted to change your decision, you needed to start over. The good news is that in Citus 10, we introduced 2 new user-defined functions (UDFs) to make it easier for you to make changes to your distributed Postgres tables.
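As a hedged sketch of what changing those properties could look like, the Python helpers below only build SQL strings. The excerpt does not name the two UDFs; this example assumes they are `alter_distributed_table` and `alter_table_set_access_method` as documented for Citus 10. The table name `events`, the shard count of 64, and the helper names are hypothetical.

```python
import re

# Accept only simple unquoted identifiers, since values are interpolated into SQL text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_ident(name):
    """Return name unchanged if it is a plain identifier, else raise ValueError."""
    if not _IDENT.match(name):
        raise ValueError(f"not a simple identifier: {name!r}")
    return name


def alter_shard_count_sql(table, shard_count):
    """Build a statement that changes the shard count of a distributed table."""
    return (
        f"SELECT alter_distributed_table('{_check_ident(table)}', "
        f"shard_count := {int(shard_count)});"
    )


def set_access_method_sql(table, access_method):
    """Build a statement that switches a table's access method (e.g. heap or columnar)."""
    return (
        f"SELECT alter_table_set_access_method('{_check_ident(table)}', "
        f"'{_check_ident(access_method)}');"
    )


# Hypothetical usage: the statements would be run with a Postgres client of your choice.
statements = [
    alter_shard_count_sql("events", 64),
    set_access_method_sql("events", "columnar"),
]
```

The helpers do not connect to a database; they just produce the statements, which keeps the sketch runnable without a Citus cluster.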

Keep reading

Once you start using the Citus extension to distribute your Postgres database, you may never want to go back. But what if you just want to experiment with Citus and want to have the comfort of knowing you can go back? Well, as of Citus 9.5, now there is a new undistribute_table() function to make it easy for you to, well, to revert a distributed table back to being a regular Postgres table.

If you are familiar with Citus, you know that Citus is an open source extension to Postgres that distributes your data (and queries) to multiple machines in a cluster—thereby parallelizing your workload and scaling your Postgres database horizontally. When you start using Citus—whether you’re using Citus open source or whether you’re using Citus as part of a managed service in the cloud—usually the first thing you need to do is distribute your Postgres tables across the cluster.
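The round trip described above, distributing a table and later reverting it, can be sketched as a pair of SQL statements. The Python below only builds the statement text for `create_distributed_table` and `undistribute_table`; the table name `events`, the distribution column `tenant_id`, and the helper names are hypothetical and chosen for illustration.

```python
import re

# Accept only simple unquoted identifiers, since values are interpolated into SQL text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def _check_ident(name):
    """Return name unchanged if it is a plain identifier, else raise ValueError."""
    if not _IDENT.match(name):
        raise ValueError(f"not a simple identifier: {name!r}")
    return name


def distribute_sql(table, distribution_column):
    """Build the statement that turns a regular table into a distributed table."""
    return (
        f"SELECT create_distributed_table('{_check_ident(table)}', "
        f"'{_check_ident(distribution_column)}');"
    )


def undistribute_sql(table):
    """Build the statement that reverts a distributed table to a regular table."""
    return f"SELECT undistribute_table('{_check_ident(table)}');"


# Hypothetical experiment: distribute, try things out, then go back.
round_trip = [distribute_sql("events", "tenant_id"), undistribute_sql("events")]
```

Running the two statements in order against a database with the Citus extension (9.5 or later for `undistribute_table`) would be one way to experiment and then return to a plain Postgres table.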

Keep reading

When to use Hyperscale (Citus) to scale out Postgres

Written by Claire Giordano | December 5, 2020

If you’ve built your application on Postgres, you already know why so many people love Postgres.

And if you’re new to Postgres, the list of reasons people love Postgres is loooong—and includes things like: 3 decades of database reliability baked in; rich datatypes; support for custom types; myriad index types from B-tree to GIN to BRIN to GiST; support for JSON and JSONB from early days; constraints; foreign data wrappers; rollups; the geospatial capabilities of the PostGIS extension, and all the innovations that come from the many Postgres extensions.

But what to do if your Postgres database gets very large?

Keep reading

In my work as an engineer on the Postgres team at Microsoft, I get to meet all sorts of customers going through many challenging projects. One recent database migration project I worked on is a story that just needs to be told. The customer, in the retail space, was using Redshift as the data warehouse and Databricks as their ETL engine. Their setup was deployed on AWS and GCP, across different data centers in different regions. And they’d been running into performance bottlenecks and were also incurring unnecessary egress costs.

Specifically, the amount of data in our customer’s analytic store was growing faster than the compute required to process that data. AWS Redshift was not able to offer independent scaling of storage and compute—hence our customer was paying extra cost by being forced to scale up the Redshift nodes to account for growing data volumes. To address these issues, they decided to migrate their analytics landscape to Azure.

Keep reading

When working on the internals of Citus, an open source extension to Postgres that transforms Postgres into a distributed database, we often get to talk with customers that have interesting challenges you won’t find everywhere. Just a few months back, I encountered an analytics workload that was a really good fit for Citus.

But we had one problem: the percentile calculations on their data (over 300 TB of data) could not meet their SLA of 30 seconds.

To make things worse, the query performance was not even close to the target: the percentile calculations were taking about 6 minutes instead of the required 30 seconds.

Keep reading

The last two months, I managed the agenda for our weekly Citus team meeting, the one time each week where our entire distributed team—with people spread across 6 different countries—gets together to talk about Citus things. As I chatted with our PostgreSQL folks to find speakers to give 10-minute “lightning talks”, I heard a chorus from several of the engineers: “see if you can get Joe to give a talk. His talks are always super interesting.”

I succeeded. Joe Nelson (known as begriffs online) did deliver a talk titled “Dominus SQL, lord of my domain.” And the engineers liked it. Not a surprise, as Joe’s content tends to be pretty popular, both on his personal blog, and on the Citus Data blog, including high traffic posts such as 5 ways to paginate in Postgres and Faster PostgreSQL Counting.

And when Joe agreed to let me interview him about his work on the Citus documentation (he’s quite busy so I wasn’t sure he would say yes), well, I was thrilled. This post is an edited transcript of my interview with Joe—and it’s your inside baseball view into how the documentation for the Citus open source project gets made.

Keep reading

Microsoft Azure Welcomes PostgreSQL Committers

Written by Ozgun Erdogan | March 3, 2020

Interview with the Postgres committers who have joined the Postgres team at Microsoft by Sudhakar Sannakkayala (Partner Director, Azure Data) and Ozgun Erdogan (Principal, Azure Data)—cross-posted from the Azure Database for PostgreSQL Blog.

In recent years, the data landscape has seen strong innovation as a result of the onset of open source technologies. At the forefront, PostgreSQL has shown that it’s the open source database built for every type of developer. By staying true to its principles of being standards-compliant, highly programmable, and extensible, PostgreSQL has solidified its position as the “most loved database” of developers across the board—ranging from scenarios for OLTP, analytics, and business intelligence to processing various formats of geometric data using the PostGIS extension.

Keep reading
