Percona users detail open source database challenges

Speakers from Shopify and LinkedIn outline open source database cloud and security challenges and best practices at the Percona Live Online conference.

The business networking platform LinkedIn uses MySQL extensively as a back-end data store for both internal and public-facing assets.

LinkedIn has a centralized MySQL site reliability engineering (SRE) team that provides MySQL as a managed service inside of the company, which uses about 2,300 MySQL databases currently.

LinkedIn engineer Karthik Appigatla, during a technology keynote session Wednesday at the 24-hour Percona Live Online conference, outlined how the business networking site has managed to scale and secure its MySQL deployment.

"We have a lot of microservices and each microservice has its own database," Appigatla said.

Security is a primary responsibility for LinkedIn's MySQL SRE team and Appigatla detailed multiple steps that LinkedIn takes to help reduce risk.

User access management is tightly controlled, with strong passwords automatically generated for users. Going a step further, only users from certain whitelisted IP addresses can access specific databases, and LinkedIn maintains a full audit system to see who accesses what information.
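The access controls described above can be illustrated with a minimal sketch. It assumes a provisioning script that generates a random password and emits standard MySQL `CREATE USER` and `GRANT` statements bound to a single client IP address. The function names, the account name and the privilege list are hypothetical and are not taken from LinkedIn's tooling.

```python
import ipaddress
import secrets
import string

# Letters and digits only, so the password can be embedded in a SQL literal
# without escaping. This alphabet is an assumption made for the sketch.
ALPHABET = string.ascii_letters + string.digits


def generate_password(length: int = 32) -> str:
    """Return a random password drawn from a cryptographically secure source."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def build_account_sql(user: str, host_ip: str, database: str):
    """Build MySQL statements for an account that is usable from one IP only."""
    ipaddress.ip_address(host_ip)  # raises ValueError if this is not a valid IP
    for name in (user, database):
        if not name.isidentifier():
            raise ValueError(f"unexpected characters in name: {name!r}")
    password = generate_password()
    account = f"'{user}'@'{host_ip}'"
    statements = [
        f"CREATE USER {account} IDENTIFIED BY '{password}';",
        # Hypothetical privilege list; a real deployment would choose its own.
        f"GRANT SELECT, INSERT, UPDATE, DELETE ON `{database}`.* TO {account};",
    ]
    return password, statements
```

Because MySQL identifies an account by both its user name and its host, an account created as `'orders_service'@'10.0.0.5'` cannot be used from any other address.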

All logs from the MySQL database deployments are also sent to a centralized server so that every query can be audited. LinkedIn has developed its own query analyzer for MySQL that examines the queries and can identify potential risks, Appigatla said. The company plans to make the analyzer open source in the near future.

"We get complete information on each and every query that is hitting our databases," he said. "We have information like when the query is first fired from which user and which IP address and how much time it takes each query to execute."
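As an illustration of the kind of per-query record Appigatla describes, the following sketch groups log entries by a normalized query text and tracks when each query was first seen, from which user and IP address, and how long it took. LinkedIn's analyzer had not been published at the time of the talk, so the record layout, the `fingerprint` rules and all names here are assumptions.

```python
import re
from dataclasses import dataclass


@dataclass
class QueryStats:
    first_seen: str          # timestamp of the first occurrence
    user: str
    ip: str
    count: int = 0
    total_seconds: float = 0.0


def fingerprint(sql: str) -> str:
    """Collapse literals so queries that differ only in values group together."""
    sql = re.sub(r"'[^']*'", "?", sql)   # quoted strings
    sql = re.sub(r"\b\d+\b", "?", sql)   # bare numbers
    return re.sub(r"\s+", " ", sql).strip().lower()


def summarize(records):
    """records: iterable of (timestamp, user, ip, sql, seconds) in time order."""
    stats = {}
    for ts, user, ip, sql, seconds in records:
        key = (fingerprint(sql), user, ip)
        # setdefault keeps the first timestamp seen for this key.
        entry = stats.setdefault(key, QueryStats(first_seen=ts, user=user, ip=ip))
        entry.count += 1
        entry.total_seconds += seconds
    return stats
```

Grouping by fingerprint rather than raw text keeps the summary small, since `WHERE id = 1` and `WHERE id = 42` count as the same query.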

Figure: MySQL is widely deployed at LinkedIn across a multi-tenant architecture.

Shopify's challenges of deploying MySQL in the public cloud

Meanwhile, e-commerce platform vendor Shopify has seen firsthand some of the problems when deploying database services in the cloud. The Ottawa-based vendor deploys its fleet of MySQL services on the Google Cloud Platform at large scale.

Shopify engineers Akshay Suryawanshi and Jeremy Cole, a senior staff production engineer, outlined some of the challenges they faced with cloud deployment during a technology keynote session at the Percona conference on May 19.

Suryawanshi noted that Shopify is used by more than a million merchants during the peak Black Friday through Cyber Monday shopping period (Nov. 29 to Dec. 2 in 2019) and it can handle hundreds of millions of queries across its MySQL infrastructure.

A key promise of the cloud is elastic scalability, which enables users to start up new servers on demand to handle traffic. Cole noted that the instant, on-demand promise doesn't always work out as expected.

Shopify has experienced what are known as "stockouts," situations in which it requested a virtual compute resource but the cloud provider didn't immediately have that resource available.

"Stockouts are a real thing that actually happen. We may not be able to allocate the resources that we want at all times," Cole said. "The cloud does have some resources available generally, they just aren't always immediately available."

As such, Cole recommended that when it comes to disaster recovery, it's not a good idea to rely on resources that are provisioned on demand. Rather, he suggested that for disaster recovery the required virtual resources should always be running, to help limit the risk of any downtime.

The risk of stockouts can also be minimized by choosing smaller-sized virtual resources. Cole noted that Shopify currently makes use of some large compute instances, which can often be less available than smaller resources.

"The broader set of machines that are available for allocation the better and the smaller instances the better, because they're bin-packed onto physical machines," Cole said. "So, the larger size you choose, the less schedulable it is."
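Cole's point about instance size can be shown with a small worked example. The sketch below counts how many physical hosts have enough spare capacity for one instance of a given size. The free-capacity figures are invented for illustration and do not describe any real cloud provider.

```python
def schedulable_hosts(free_vcpus, instance_vcpus):
    """Count hosts with enough free vCPUs to place one instance of this size."""
    return sum(1 for free in free_vcpus if free >= instance_vcpus)


# Hypothetical free capacity (in vCPUs) left on each physical host in a zone.
free_vcpus = [4, 8, 8, 16, 32, 64]

for size in (4, 16, 64):
    print(size, schedulable_hosts(free_vcpus, size))
# prints:
# 4 6
# 16 3
# 64 1
```

In this example a 4-vCPU instance fits on all six hosts, while a 64-vCPU instance fits on only one, so the larger request is far more likely to hit a stockout.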

More enterprises using open source databases

Open source database use is a growing trend, according to Percona, based in Raleigh, N.C.

Percona provides supported versions of multiple open source database platforms, including MySQL.

The use of open source databases had already been growing and has accelerated during the COVID-19 pandemic, according to Peter Zaitsev, CEO and founder of Percona. But even with the fast-growing popularity of open source databases, users still face an array of cloud deployment and security challenges.

"We have a pandemic upon us and while it is incredibly tragic, I think that it can have a positive effect on the adoption of open source software," Zaitsev said during his opening keynote on May 19. "A lot of people are pushed to accelerate digital transformation, bringing more services online than before and it needs to be done at a lower cost due to the economic slowdown."

Zaitsev said that organizations increasingly find it easier to choose a database as a service (DBaaS) approach, in which the database is managed by a provider. While DBaaS is an easy way to get started with a database, Zaitsev argued that it also presents problems.

"Developers and end users choose database services without the supervision of folks that really understand the databases," Zaitsev said. "That can cause ... a variety of bad outcomes, ranging from security leaks to very inefficient delivery of the database services."
