70.2. System Catalog Initial Data


70.2.1. Data File Format
70.2.2. OID Assignment
70.2.3. OID Reference Lookup
70.2.4. Recipes for Editing Data Files

Each catalog that has any manually-created initial data (some do not) has a corresponding .dat file that contains its initial data in an editable format.

70.2.1. Data File Format

Each .dat file contains Perl data structure literals that are simply eval'd to produce an in-memory data structure consisting of an array of hash references, one per catalog row. A slightly modified excerpt from pg_database.dat will demonstrate the key features:

```
[

# A comment could appear here.
{ oid => '1', oid_symbol => 'TemplateDbOid',
  descr => 'database\'s default template',
  datname => 'template1', datdba => 'PGUID', encoding => 'ENCODING',
  datcollate => 'LC_COLLATE', datctype => 'LC_CTYPE', datistemplate => 't',
  datallowconn => 't', datconnlimit => '-1', datlastsysoid => '0',
  datfrozenxid => '0', datminmxid => '1', dattablespace => '1663',
  datacl => '_null_' },

]
```

Points to note:

70.2.2. OID Assignment

A catalog row appearing in the initial data can be given a manually-assigned OID by writing an oid => nnnn metadata field. Furthermore, if an OID is assigned, a C macro for that OID can be created by writing an oid_symbol => name metadata field.

Pre-loaded catalog rows must have preassigned OIDs if there are OID references to them in other pre-loaded rows. A preassigned OID is also needed if the row's OID must be referenced from C code. If neither case applies, the oid metadata field can be omitted, in which case the bootstrap code assigns an OID automatically, or leaves it zero in a catalog that has no OIDs. In practice we usually preassign OIDs for all or none of the pre-loaded rows in a given catalog, even if only some of them are actually cross-referenced.

Writing the actual numeric value of any OID in C code is considered very bad form; always use a macro instead. Direct references to pg_proc OIDs are common enough that there's a special mechanism to create the necessary macros automatically; see src/backend/utils/Gen_fmgrtab.pl. Similarly (though, for historical reasons, not in the same way), there's an automatic method for creating macros for pg_type OIDs. oid_symbol entries are therefore not necessary in those two catalogs. Likewise, macros for the pg_class OIDs of system catalogs and indexes are set up automatically. For all other system catalogs, you have to manually specify any macros you need via oid_symbol entries.

To find an available OID for a new pre-loaded row, run the script src/include/catalog/unused_oids. It prints inclusive ranges of unused OIDs (e.g., the output line “45-900” means OIDs 45 through 900 have not been allocated yet). Currently, OIDs 1-9999 are reserved for manual assignment; the unused_oids script simply looks through the catalog headers and .dat files to see which ones do not appear. You can also use the duplicate_oids script to check for mistakes. (genbki.pl will also detect duplicate OIDs at compile time.)
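The idea behind the two checks can be sketched as follows. This is an illustrative Python sketch, not the actual unused_oids or duplicate_oids scripts (which are Perl and scan the catalog headers and .dat files themselves); the function names and the sample list are hypothetical, and the input is assumed to be a plain list of already-assigned OIDs.

```python
from collections import Counter


def unused_ranges(used, first=1, last=9999):
    """Return inclusive (start, end) ranges in [first, last] absent from `used`."""
    used = set(used)
    ranges = []
    start = None  # start of the gap currently being scanned, if any
    for oid in range(first, last + 1):
        if oid in used:
            if start is not None:
                ranges.append((start, oid - 1))
                start = None
        elif start is None:
            start = oid
    if start is not None:
        ranges.append((start, last))
    return ranges


def duplicate_oids(assigned):
    """Return the sorted OIDs that appear more than once in `assigned`."""
    counts = Counter(assigned)
    return sorted(oid for oid, n in counts.items() if n > 1)


# Made-up sample data, not real catalog contents.
sample = [1, 2, 5, 5, 8]
for start, end in unused_ranges(sample, first=1, last=10):
    print(f"{start}-{end}")
print("duplicates:", duplicate_oids(sample))
```

With the sample list and the range 1 through 10, the sketch reports the gaps 3-4, 6-7, and 9-10, and flags 5 as a duplicate.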

The OID counter starts at 10000 at the beginning of a bootstrap run. If a catalog row is in a table that requires OIDs, but no OID was preassigned by an oid field, then it will receive an OID of 10000 or above.
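A minimal Python sketch of this rule, for illustration only: the constant name FIRST_AUTO_OID and the function assign_missing_oids are hypothetical, the sample rows are made up, and the sketch assumes a catalog that requires OIDs and whose preassigned OIDs all lie below 10000.

```python
FIRST_AUTO_OID = 10000  # counter start stated in the text; the name is hypothetical


def assign_missing_oids(rows, next_oid=FIRST_AUTO_OID):
    """Return copies of `rows` in which every row has an 'oid' field."""
    result = []
    for row in rows:
        row = dict(row)  # leave the caller's data untouched
        if "oid" not in row:
            # No preassigned OID: take the next value from the counter.
            row["oid"] = str(next_oid)
            next_oid += 1
        result.append(row)
    return result


# Made-up rows: one with a preassigned OID, two without.
rows = [
    {"oid": "1", "name": "preassigned"},
    {"name": "first_auto"},
    {"name": "second_auto"},
]
for row in assign_missing_oids(rows):
    print(row["oid"], row["name"])
```

Here the first row keeps its preassigned OID of 1, while the other two receive 10000 and 10001.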

70.2.3. OID Reference Lookup

A cross-reference from one initial catalog row to another can be written simply as the preassigned OID of the referenced row. That is error-prone and hard to read, however, so for frequently referenced catalogs genbki.pl provides mechanisms for writing symbolic references instead. Currently this is possible for references to access methods, functions, operators, opclasses, opfamilies, and types. The rules are as follows:

Use of symbolic references is enabled in a particular catalog column by attaching BKI_LOOKUP(lookuprule) to the column's definition, where lookuprule is pg_am, pg_proc, pg_operator, pg_opclass, pg_opfamily, or pg_type. BKI_LOOKUP can be attached to columns of type Oid, regproc, oidvector, or Oid[]; in the latter two cases it implies performing a lookup on each element of the array.
In such a column, all entries must use the symbolic format except when writing 0 for InvalidOid. (If the column is declared regproc, you can optionally write - instead of 0.) genbki.pl will warn about unrecognized names.
Access methods are just represented by their names, as are types. Type names must match the referenced pg_type entry's typname; you do not get to use any aliases such as integer for int4.
A function can be represented by its proname, if that is unique among the pg_proc.dat entries (this works like regproc input). Otherwise, write it as proname(argtypename,argtypename,...), like regprocedure. The argument type names must be spelled exactly as they are in the pg_proc.dat entry's proargtypes field. Do not insert any spaces.
Operators are represented by oprname(lefttype,righttype), writing the type names exactly as they appear in the pg_operator.dat entry's oprleft and oprright fields. (Write 0 for the omitted operand of a unary operator.)
The names of opclasses and opfamilies are only unique within an access method, so they are represented by access_method_name/object_name.
In none of these cases is there any provision for schema-qualification; all objects created during bootstrap are expected to be in the pg_catalog schema.

genbki.pl resolves all symbolic references while it runs, and puts simple numeric OIDs into the emitted BKI file. There is therefore no need for the bootstrap backend to deal with symbolic references.
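The lookup rules for types and functions can be sketched roughly as below. This is an illustrative Python sketch rather than genbki.pl's actual code: all function names, row contents, and OIDs are made up, proargtypes is assumed to be stored as a space-separated list, and an unrecognized name raises an exception here whereas genbki.pl is described above as issuing a warning.

```python
from collections import Counter


def build_type_lookup(type_rows):
    """Map each typname to its OID; no aliases are recognized."""
    return {row["typname"]: row["oid"] for row in type_rows}


def build_proc_lookup(proc_rows):
    """Map proname(argtypes), and proname alone when unique, to the OID."""
    name_counts = Counter(row["proname"] for row in proc_rows)
    lookup = {}
    for row in proc_rows:
        # Assumption: proargtypes is space-separated in the data file; the
        # symbolic form joins the type names with commas and no spaces.
        args = ",".join(row["proargtypes"].split())
        lookup[f"{row['proname']}({args})"] = row["oid"]
        if name_counts[row["proname"]] == 1:
            lookup[row["proname"]] = row["oid"]
    return lookup


def resolve(value, lookup, is_regproc=False):
    """Turn a symbolic reference into a numeric OID string."""
    # 0 means InvalidOid; '-' is accepted only in regproc columns.
    if value == "0" or (is_regproc and value == "-"):
        return "0"
    if value not in lookup:
        raise LookupError(f"unresolved OID reference: {value}")
    return lookup[value]


# Made-up rows with made-up OIDs, not real catalog contents.
types = [
    {"oid": "9001", "typname": "int4"},
    {"oid": "9002", "typname": "text"},
]
procs = [
    {"oid": "9101", "proname": "demo_add", "proargtypes": "int4 int4"},
    {"oid": "9102", "proname": "demo_len", "proargtypes": "text"},
    {"oid": "9103", "proname": "demo_len", "proargtypes": "int4"},
]
type_lookup = build_type_lookup(types)
proc_lookup = build_proc_lookup(procs)

print(resolve("int4", type_lookup))
print(resolve("demo_add", proc_lookup))        # unique name, bare form allowed
print(resolve("demo_len(text)", proc_lookup))  # overloaded, needs argument types
```

In this sketch the bare name demo_len is rejected because two rows share it, and integer is rejected as a type name because only typname values are registered.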

70.2.4. Recipes for Editing Data Files

Here are some suggestions about the easiest ways to perform common tasks when updating catalog data files.

Add a new column with a default to a catalog: Add the column to the header file with a BKI_DEFAULT(value) annotation. The data file need only be adjusted by adding the field in existing rows where a non-default value is needed.

Add a default value to an existing column that doesn't have one: Add a BKI_DEFAULT annotation to the header file, then run make reformat-dat-files to remove now-redundant field entries.

Remove a column, whether it has a default or not: Remove the column from the header, then run make reformat-dat-files to remove now-useless field entries.

Change or remove an existing default value: You cannot simply change the header file, since that will cause the current data to be interpreted incorrectly. First run make expand-dat-files to rewrite the data files with all default values inserted explicitly, then change or remove the BKI_DEFAULT annotation, then run make reformat-dat-files to remove superfluous fields again.

Ad-hoc bulk editing: reformat_dat_file.pl can be adapted to perform many kinds of bulk changes. Look for its block comments showing where one-off code can be inserted. In the following example, we are going to consolidate two boolean fields in pg_proc into a char field:

Add the new column, with a default, to pg_proc.h:

```
+    /* see PROKIND_ categories below */
+    char        prokind BKI_DEFAULT(f);
```

Create a new script based on reformat_dat_file.pl to insert appropriate values on-the-fly:

```
-           # At this point we have the full row in memory as a hash
-           # and can do any operations we want. As written, it only
-           # removes default values, but this script can be adapted to
-           # do one-off bulk-editing.
+           # One-off change to migrate to prokind
+           # Default has already been filled in by now, so change to other
+           # values as appropriate
+           if ($values{proisagg} eq 't')
+           {
+               $values{prokind} = 'a';
+           }
+           elsif ($values{proiswindow} eq 't')
+           {
+               $values{prokind} = 'w';
+           }
```

Run the new script:

```
$ cd src/include/catalog
$ perl rewrite_dat_with_prokind.pl pg_proc.dat
```

At this point pg_proc.dat has all three columns, prokind, proisagg, and proiswindow, though they will appear only in rows where they have non-default values.

Remove the old columns from pg_proc.h:

```
-    /* is it an aggregate? */
-    bool        proisagg BKI_DEFAULT(f);
-
-    /* is it a window function? */
-    bool        proiswindow BKI_DEFAULT(f);
```

Finally, run make reformat-dat-files to remove the useless old entries from pg_proc.dat.

For further examples of scripts used for bulk editing, see convert_oid2name.pl and remove_pg_type_oid_symbols.pl attached to this message: https://www.postgresql.org/message-id/CAJVSVGVX8gXnPm+Xa=DxR7kFYprcQ1tNcCT5D0O3ShfnM6jehA@mail.gmail.com
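
The reason the recipe for changing or removing an existing default value expands the data files first can be illustrated with a small Python sketch. It only models the idea, under the assumption that a row omits any field whose value equals the column's default; the helper names, the flag column, and the sample row are hypothetical and do not reflect how the Perl tooling is written.

```python
def expand_defaults(row, defaults):
    """Return a full row: explicit fields plus defaults for omitted ones."""
    full = dict(defaults)
    full.update(row)
    return full


def strip_defaults(row, defaults):
    """Return a compact row: drop fields whose value equals the default."""
    return {k: v for k, v in row.items() if defaults.get(k) != v}


old_defaults = {"flag": "f"}
new_defaults = {"flag": "t"}

# Made-up stored row: 'flag' is omitted, so it means 'f' under the old default.
stored = {"name": "demo"}

# Changing only the default silently changes the meaning of the row.
misread = expand_defaults(stored, new_defaults)

# Expanding under the old default first preserves the intended value.
expanded = expand_defaults(stored, old_defaults)
compact = strip_defaults(expanded, new_defaults)

print("header changed first:", misread["flag"])
print("expanded first:", compact["flag"])
```

In the sketch, changing the default without expanding makes the row read as flag = 't', whereas expanding first keeps an explicit flag = 'f' in the compacted row.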
