How to generate dummy data in postgres?
learn how to generate large volume of dummy data in postgres with generate_series utility
Lets create an employee table with id, name and salary columns and load some data,
Create a dummy table
CREATE TABLE public.employee (
id int8 NOT NULL,
name varchar(120) NOT NULL,
salary int8 NOT NULL,
CONSTRAINT emp_pk PRIMARY KEY (id)
);
Generate dummy data
Below sql query generates 1 Million employee records,
WITH salary_list AS (
SELECT '{1000, 2000, 5000}'::INT[] salary
)
INSERT INTO public.employee
(id, name, salary)
SELECT n, 'Employee ' || n as name, salary[1 + mod(n, array_length(salary, 1))]
FROM salary_list, generate_series(1, 1000000) as n
Lets try to decouple the query,
salary_list
WITH salary_list AS (
SELECT '{1000, 2000, 5000}'::INT[] salary
)
This is a common table expression, which returns an integer array of salaries. we can use this salary column within this query.
generate_series
generate_series is a postgres function which takes a min and max number and generates values, an example below
select n from generate_series(1, 10) as n
'Employee ' || n as name
generates employee names with their id as suffix.
salary[1 + mod(n, array_length(salary, 1))]
this picks up salary based on n value returned from generate series.
So the final output that we will get is below,
id | name | salary |
1 | Employee 1 | 2000 |
2 | Employee 2 | 5000 |
3 | Employee 3 | 1000 |
4 | Employee 4 | 2000 |
5 | Employee 5 | 5000 |
Hope you liked this trick to import random data into postgres. please follow this blog to get more postgres tips.