How to generate dummy data in postgres?
learn how to generate large volume of dummy data in postgres with generate_series utility
Lets create an employee table with id, name and salary columns and load some data,
Create a dummy table
CREATE TABLE public.employee (
id int8 NOT NULL,
name varchar(120) NOT NULL,
salary int8 NOT NULL,
CONSTRAINT emp_pk PRIMARY KEY (id)
);
Generate dummy data
Below sql query generates 1 Million employee records,
WITH salary_list AS (
SELECT '{1000, 2000, 5000}'::INT[] salary
)
INSERT INTO public.employee
(id, name, salary)
SELECT n, 'Employee ' || n as name, salary[1 + mod(n, array_length(salary, 1))]
FROM salary_list, generate_series(1, 1000000) as n
Lets try to decouple the query,
salary_list
WITH salary_list AS (
SELECT '{1000, 2000, 5000}'::INT[] salary
)
This is a common table expression, which returns an integer array of salaries. we can use this salary column within this query.
generate_series
generate_series is a postgres function which takes a min and max number and generates values, an example below
select n from generate_series(1, 10) as n
'Employee ' || n as name
generates employee names with their id as suffix.
salary[1 + mod(n, array_length(salary, 1))]
this picks up salary based on n value returned from generate series.
So the final output that we will get is below,
id | name | salary |
1 | Employee 1 | 2000 |
2 | Employee 2 | 5000 |
3 | Employee 3 | 1000 |
4 | Employee 4 | 2000 |
5 | Employee 5 | 5000 |
Hope you liked this trick to import random data into postgres. please follow this blog to get more postgres tips.
Did you find this article valuable?
Support Suresh Kumar by becoming a sponsor. Any amount is appreciated!