Saturday, June 22, 2024

A Small Partition – The Daily WTF

Programming LanguageA Small Partition - The Daily WTF


Once upon a time, I was tuning a database performance issue. The backing database was an Oracle database, and the key problem was simply that the data needed to be partitioned. Great, easy, I wrote up a change script, applied it to a test environment, gathered some metrics to prove that it had the effects we expected, and submitted a request to apply it to production.

And the DBAs came down on me like a sledgehammer. Why? Well, according to our DBAs, the license we had with Oracle didn’t let us use partitioning. The feature wasn’t disabled in any way, but when an Oracle compliance check was performed, we’d get dinged and they’d charge us big bucks for having used the feature- and if we wanted to enable it, it’d cost us $10,000 a year, and no one was willing to pay that.

Now, I have no idea how true this actually was. I have no reason to disbelieve the DBAs I was working with, but perhaps they were being overly cautious. But the result is that I had to manually partition the data into different tables. The good news was all the writes always went into the most recent table, almost all of the reads went to either the current table or last month’s table, and everything else was basically legacy and while it might be used in a report, if it was slower than the pitch drop experiment, that was fine.

It was stupid, and it sucked, but it wasn’t the worst sin I’d ever committed.

Which is why I have at least some sympathy for this stored procedure, found by Ayende.

ALTER PROCEDURE GetDataForDate
   @date DATETIME
AS
   DECLARE @sql nvarchar(max)
   SET @sql = 'select * from data_' + convert(nvarchar(30),getdate(),112)
   EXEC sp_executesql @sql

Now, this is for an MS SQL database, which does not have any weird licensing around using partitions. But we can see here the manual partitioning in use.

There are a set of data_yyyymmdd tables. When we call this function, it takes the supplied date and writes a query specific to that table. This means that there is a table for every day.

Ayende got called in for this because one of the reports was running slowly. This report simply… used all of the tables. It just UNIONed them together. This, of course, removed any benefit of partitioning, and didn’t exactly make the query planning engine’s job easy, either. The execution paths it generated were not terribly efficient.

At the time Ayende first found it, there were 750 tables. And obviously, as each day ticked past, a new table was created. And yes, someone manually updated the view, every day.

Ayende sent this to us many tables ago, and I dread to think how many tables are yet to be created.

[Advertisement]
Continuously monitor your servers for configuration changes, and report when there’s configuration drift. Get started with Otter today!

Check out our other content

Check out other tags:

Most Popular Articles