Update Table Data for Uniform Distribution in SQL
SQL developers who want an equal number of rows in each data category of a SQL Server database table, that is, a uniform distribution among categories, can use the SQL script given in this tutorial.
Before the update, the rows of the original SQL table are distributed unevenly across the values of the selected column.
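A quick way to inspect such a distribution is to count the rows per category. The query below is only a minimal sketch: the table name `dbo.SampleData` and the column name `CategoryId` are assumptions made for illustration and do not come from the original script.

```sql
-- Hypothetical names: dbo.SampleData is the table,
-- CategoryId is the column whose distribution we inspect.
SELECT
    CategoryId,
    COUNT(*) AS RowCnt      -- number of rows in this category
FROM dbo.SampleData
GROUP BY CategoryId
ORDER BY CategoryId;
```

Running the same query before and after each update makes it easy to compare the row counts per category.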
After the tutorial's SQL update statement is executed, the table data has a uniform distribution on the selected column.
Unfortunately, I realized that the above update is not sufficient in some cases.
Assume that the values in our sample database table are distributed as in the numbers below.
As you may notice, the first sample had 100 rows in total.
For the following case, I introduced two new groups with only a few rows each:
5 new rows for category 400 and 5 new rows for category 500.
After we execute the above code, the updated data has the count values below.
The target count per category is (100+5+5)/5 = 22. The smallest groups have only 5 rows, so each of them is 22 - 5 = 17 rows short of the target. Since this difference (17) is bigger than the group's own row count (5), the distribution is not uniform even after the first correction.
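The arithmetic above can also be computed directly from the table. The following is a sketch under the assumption of a hypothetical table `dbo.SampleData` with a category column `CategoryId`; neither name is taken from the original script.

```sql
-- Hypothetical names: dbo.SampleData (table), CategoryId (category column).
-- SUM(COUNT(*)) OVER () is the total row count across all groups;
-- COUNT(*) OVER () is the number of groups (categories).
SELECT
    CategoryId,
    COUNT(*) AS RowCnt,
    SUM(COUNT(*)) OVER () / COUNT(*) OVER () AS TargetCnt,            -- integer division
    SUM(COUNT(*)) OVER () / COUNT(*) OVER () - COUNT(*) AS Shortfall  -- negative if over target
FROM dbo.SampleData
GROUP BY CategoryId
ORDER BY CategoryId;
```

With 110 rows in 5 categories, `TargetCnt` would be 22 for every category, and a category holding 5 rows would show a `Shortfall` of 17.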
Another attempt to make the numbers equal among the different categories or groups, by running the same script once more, will improve the result.
If you are not satisfied with the result, try again: execute the exact same SQL script once more, and then check the final value distribution in the table data.
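To decide whether another run is needed, the check can be reduced to a single row. This is a hedged sketch, again assuming the hypothetical names `dbo.SampleData` and `CategoryId`; it treats the distribution as uniform when the largest and smallest groups differ by at most one row, which is an assumption about what "uniform" should mean when the total does not divide evenly.

```sql
-- Hypothetical names: dbo.SampleData (table), CategoryId (category column).
SELECT
    MIN(g.RowCnt) AS MinCnt,
    MAX(g.RowCnt) AS MaxCnt,
    -- 1 when the group sizes differ by at most one row, otherwise 0
    CASE WHEN MAX(g.RowCnt) - MIN(g.RowCnt) <= 1 THEN 1 ELSE 0 END AS IsUniform
FROM (
    SELECT CategoryId, COUNT(*) AS RowCnt
    FROM dbo.SampleData
    GROUP BY CategoryId
) AS g;
```

If `IsUniform` comes back as 0, the update script can be executed again and the check repeated.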
Yes, now it is all uniformly distributed :)