I'm guessing you are here because you have already worked this out for yourself. Well, don't worry! There is a solution. We just have to go a little old school.

generate_series() is a set returning function (that's a fancy way to say a function that returns more than one value). It is provided by PostgreSQL to generate a numbers that can be used in queries. Here is an example:

```
SELECT * FROM generate_series(2,4);
generate_series
-----------------
2
3
4
(3 rows)
```

Redshift supports many of the features of PostgreSQL around the 8.x version, but there are several things that it specifically does not support. Unfortunately generate_series(), or rather, any non-scalar function is not supported.

The solution is to create a number table that consists of all the numbers from 0 to some large number. The catch is that the maximum number should be greater than any number you expect to use in a series.

```
CREATE TABLE IF NOT EXISTS number (
number INTEGER NOT NULL
) DISTSTYLE ALL SORTKEY (number);
```

Create just over a thousand numbers (before you start to do the math in your head, just hold on a few more paragraphs):

```
INSERT INTO number VALUES (1), (2), (3), (4), (5), (6), (7), (8);
INSERT INTO number SELECT number + 8 FROM number;
INSERT INTO number SELECT number + 16 FROM number;
INSERT INTO number SELECT number + 32 FROM number;
INSERT INTO number SELECT number + 64 FROM number;
INSERT INTO number SELECT number + 128 FROM number;
INSERT INTO number SELECT number + 256 FROM number;
INSERT INTO number SELECT number + 512 FROM number;
```

Using this method we double the size of the table with each INSERT. As long as the addition numbers double each time you can go as large as you desire. I needed at least 500k+ but I left out all those extra steps for brevity.

If you also need 0 (a really good idea) add it after all the other numbers have been generated:

`INSERT INTO number VALUES (0);`

Back to your brain doing the math. The result number of records (not including the zero) will be double the greatest addition number. So, 1024 for the SQL above.

To prove that all numbers are correct and are in sequence you can use the following formula:

```
n(n + 1)
1 + 2 + 3 ... + n = --------
2
```

Check that total and expected_total are the same:

```
SELECT
SUM(number::BIGINT) AS total,
(MAX(number::BIGINT) * (MAX(number::BIGINT) + 1)) / 2 AS expected_total
FROM number;
```

Usage

The original generate_series query above can be written as:

`SELECT * FROM numbers WHERE number BETWEEN 2 AND 4;`

Sure, this is less than spectacular when used so trivially... I'll leave the magic to you.

comments powered by Disqus