Formatting code with CLion in PostgreSQL style

313 views
Skip to the first unread message

Ben Christel

unread,
23 Oct 2017, 20:26:2923/10/2017
to Greenplum Developers, Amil Khanzada
Hi all,

We are using CLion as our IDE for working with Greenplum code and saw this in the README.md:

When it comes to C and C++ parts of Greenplum, we try to follow
[PostgreSQL Coding Conventions](https://www.postgresql.org/docs/devel/static/source.html).
In addition to that we require that:
   * All Python code passes [Pylint](https://www.pylint.org/)
   * All Go code is formatted according to [gofmt](https://golang.org/cmd/gofmt/)


We'd like to use CLion's auto-formatting feature to enforce the correct style, or, failing that, integrate with a command line tool that will format the code properly (e.g. by configuring CLion to call that tool any time a file is modified). Has anyone done this?

We think we can configure CLion's style preferences to match the PostgreSQL conventions but it would be tedious to set it up manually, so we're hoping for a faster solution.

Also, we are aware of emacs.samples and vim.sample in src/tools/editors/.

- Ben and Amil

Mike Roth

unread,
24 Oct 2017, 10:31:4924/10/2017
to Ben Christel, Greenplum Developers, Amil Khanzada

I don’t know if this helps but there is a pg_indent tool that will reformat the code to Postgres’ coding style. We typically only run it just before major release (as does Postgres).

Ben Christel

unread,
24 Oct 2017, 17:35:1224/10/2017
to Mike Roth, Greenplum Developers, Amil Khanzada
Thanks, Mike! We found it:

We tried running pgindent on OSX El Capitan (10.11.6) and got errors from indent:
→ src/tools/pgindent/pgindent src/tools/pgindent/typedefs.list ./src/backend/utils/resgroup/resgroup.c ./src/backend/utils/resgroup/test/resgroup_test.c
./src/backend/utils/resgroup/test/resgroup_test.c
src/tools/pgindent/pgindent: line 152: /usr/bin/indent: Argument list too long

Do you know if the pgindent script is intended to work on Mac?
Could you please let us know how GPDB runs it before a major release?


-Ben and Amil

Ning Yu

unread,
24 Oct 2017, 21:43:2024/10/2017
to Ben Christel, Mike Roth, Greenplum Developers, Amil Khanzada
Looks like the max command line length (`getconf ARG_MAX`) is too small on mac by default. On my box, it's 262144 on mac and 2097152 on a centos 7.3 VM.

One quick workaround is to change L152 in pgindent from

```
    $INDENT -bad -bap -bc -bl -d0 -cdb -nce -nfc1 -di12 -i4 -l79 \
        -lp -nip -npro -bbb $EXTRA_OPTS \
        `egrep -v '^(FD_SET|date|interval|timestamp|ANY)$' "$TYPEDEFS" | sed -e '/^$/d' -e 's/.*/-T& /'` \
        /tmp/$$a >/tmp/$$ 2>&1
```

to sth. like

```
    ( egrep -v '^(FD_SET|date|interval|timestamp|ANY)$' "$TYPEDEFS" \
      | sed -e '/^$/d' -e 's/.*/-T& /'
      echo /tmp/$$a
    ) | xargs $INDENT -bad -bap -bc -bl -d0 -cdb -nce -nfc1 -di12 -i4 -l79 \
            -lp -nip -npro -bbb $EXTRA_OPTS \
        >/tmp/$$ 2>&1
```

xargs will automatically split the arguments into several short groups and call indent on them group by group, so ARG_MAX won't be exceeded.

Ashwin Agrawal

unread,
25 Oct 2017, 00:18:2425/10/2017
to Ning Yu, Ben Christel, Mike Roth, Greenplum Developers, Amil Khanzada

I would highly recommend for all pg_indent runs, better to use it from latest Postgres version the tool, with our committed version of typedefs.list. Remember if you are adding new data structures and all, better to fist update typedefs.list to have them and then run pg_indent. Also, make sure to run only on GPDB specific files, I would say run it for specific file intended or directory. Do not run it on any upstream postgres file as that would cause merge conflicts.

Ming Li

unread,
25 Oct 2017, 01:07:4225/10/2017
to Ashwin Agrawal, Ning Yu, Ben Christel, Mike Roth, Greenplum Developers, Amil Khanzada
Instead of writing doc, it is better to call pg_indent in makefile so that it can auto format each time we compile gpdb, also we can hardcode the specific directories or files easily in that command.

Michael Schubert

unread,
25 Oct 2017, 11:22:1425/10/2017
to Ming Li, Ashwin Agrawal, Ning Yu, Ben Christel, Mike Roth, Greenplum Developers, Amil Khanzada
On Wed, Oct 25, 2017 at 1:07 AM, Ming Li <m...@pivotal.io> wrote:
Instead of writing doc, it is better to call pg_indent in makefile so that it can auto format each time we compile gpdb, also we can hardcode the specific directories or files easily in that command.

It's been discussed before doing that but unfortunately with the upstream merge still happening we'd introduce more trouble than it's worth right now. As the merge progresses further this should become less and less of an issue and at a later date we might be able to do that but for now we have to depend on following the upstream process of running pg_indent at the end of a merge cycle.

Ben Christel

unread,
25 Oct 2017, 19:53:2025/10/2017
to Michael Schubert, Ming Li, Ashwin Agrawal, Ning Yu, Mike Roth, Greenplum Developers, Amil Khanzada
Thanks for the responses, everyone!

We tried Ashwin's suggestion (using the pg_indent from Postgres) and it worked on Mac.
~/workspace/postgres/src/tools/pgindent/pgindent src/tools/pgindent/typedefs.list ./src/backend/main/main.c

However, the format it enforces doesn't seem to match the other GPDB source files we looked at—particularly with regard to the way local variable names are aligned on resgroup.c. We also saw some lines changed when we ran it on main.c:
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -22,7 +22,7 @@
 #include <pwd.h>
 #include <unistd.h>
-#if defined(__alpha) && defined(__osf__)       /* no __alpha__ ? */
+#if defined(__alpha) && defined(__osf__)   /* no __alpha__ ? */
 #include <sys/sysinfo.h>
 #include "machine/hal_sysinfo.h"
 #define ASSEMBLER
@@ -155,7 +155,7 @@ main(int argc, char *argv[])
            puts("postgres (Greenplum Database) " GP_VERSION);
            exit(0);
        }
-       if (strcmp(argv[1], "--catalog-version") == 0 )
+       if (strcmp(argv[1], "--catalog-version") == 0)
        {
            printf(_("Catalog version number:               %u\n"),
                   CATALOG_VERSION_NO);
@@ -189,7 +189,7 @@ main(int argc, char *argv[])
 #endif
    if (argc > 1 && strcmp(argv[1], "--boot") == 0)
-       AuxiliaryProcessMain(argc, argv);       /* does not return */
+       AuxiliaryProcessMain(argc, argv);   /* does not return */
..................................... (more lines see attachment).............................. 
 

Since main.c hasn't been changed since the 5.0.0 release according to Github, we think we're probably doing something wrong.


We also tried out Ning's suggestion on main.c. It produces a slightly smaller diff (~5 hunks), but there are still some lines changed:
diff --git a/src/backend/main/main.c b/src/backend/main/main.c
index 61f7cfce4a..939cfe764e 100644
--- a/src/backend/main/main.c
+++ b/src/backend/main/main.c
@@ -72,8 +72,8 @@ main(int argc, char *argv[])
     * be overwritten in order to set the process title for ps. In such cases
     * save_ps_display_args makes and returns a new copy of the argv[] array.
     *
-    * save_ps_display_args may also move the environment strings to make
-    * extra room. Therefore this should be done as early as possible during
+    * save_ps_display_args may also move the environment strings to make extra
+    * room. Therefore this should be done as early as possible during
     * startup, to avoid entanglements with code that might save a getenv()
     * result pointer.
     */
@@ -155,7 +155,7 @@ main(int argc, char *argv[])
            puts("postgres (Greenplum Database) " GP_VERSION);
            exit(0);
        }
-       if (strcmp(argv[1], "--catalog-version") == 0 )
+       if (strcmp(argv[1], "--catalog-version") == 0)
        {
..................................... (more lines see attachment)..............................


What do you all think is the best approach for formatting our new code?

-Ben and Amil

pgindent_gpdb_main.c.diff
pgindent_pg_latest_main.c.diff

Robert Eckhardt

unread,
25 Oct 2017, 20:48:3525/10/2017
to Ben Christel, Michael Schubert, Ming Li, Ashwin Agrawal, Ning Yu, Mike Roth, Greenplum Developers, Amil Khanzada
In order to look at what the pg_indent output should look like I'd look at:
Where Ashwin ran the code over a fair number of files. Prior to that the formatting in GPDB was not super consistent. 

-- Rob

Heikki Linnakangas

unread,
26 Oct 2017, 03:00:1326/10/2017
to Ben Christel, Michael Schubert, Ming Li, Ashwin Agrawal, Ning Yu, Mike Roth, Greenplum Developers, Amil Khanzada
On 10/26/2017 01:52 AM, Ben Christel wrote:
> Thanks for the responses, everyone!
>
> We tried Ashwin's suggestion (using the pg_indent from Postgres) and it
> worked on Mac.
>
>> ~/workspace/postgres/src/tools/pgindent/pgindent
>> src/tools/pgindent/typedefs.list ./src/backend/main/main.c
>
> However, the format it enforces doesn't seem to match the other GPDB source
> files we looked at—particularly with regard to the way local variable names
> are aligned on resgroup.c. We also saw some lines changed when we ran it on
> main.c:
>
> ...
>
> Since main.c hasn't been changed since the 5.0.0 release according to Github
> <https://github.com/greenplum-db/gpdb/blob/master/src/backend/main/main.c>,
> we think we're probably doing something wrong.

The upstream code we have in the GPDB repository is as it is in the
upstream. That is, it's formatted the way 8.3-era pgindent formatted it,
a long time ago. The pgindent in the GPDB repository is not identical to
the 8.3 PostgreSQL version, however. It was cherry-picked from 9.1,
along with the whole src/tools/ directory, before greenplum was even
open sourced. So if you run the version in the GPDB repository against
upstream code, you might get slightly different results. Things will get
better as we get closer to the head of PostgreSQL.

To be clear, we shouldn't pgindent upstream code in any case, because
that would just cause unnecessary differences vs. upstream, which causes
silly merge conflicts.

When I've run pgindent on GPDB-specific code, like in src/backend/cdb/,
I've used the PostgreSQL v10 of pgindent. Or to be precise, the current
version from the 'master' branch of PostgreSQL. That produces slightly
different results than the 8.3 or 9.1 version. I believe Ashwin did the
same when he ran pgdindent on src/backend/cdb recently. (Would've been
nice to mention which version of pgindent was used, in the commit
message, BTW.)

So the bottom line is that we're not terribly consistent. I don't mind
that too much. As long as the code is formatted pretty close to the
upstream pgindent style, with any pgindent version, that's good enough.
The inconsistency means that we cannot automate it, however. For now,
let's just try to be tidy when writing new code. You can run pgindent
manually on new code you write. And if the existing formatting of a
GPDB-specific file bothers you, feel free to run pgindent on it, and
submit a PR of that.

There were big changes in v10 to pgindent, and those changes make it
easier to install and run it, because you no longer need to compile a
patched version of bsd indent. So I recommend using that, because it's
easier, and that's what me & Ashwin (I believe) have been using too.

- Heikki

Ben Christel

unread,
26 Oct 2017, 12:23:2726/10/2017
to Heikki Linnakangas, Michael Schubert, Ming Li, Ashwin Agrawal, Ning Yu, Mike Roth, Greenplum Developers, Amil Khanzada
Thanks, Heikki. That clears up a lot of the questions I had.

Since we're currently modifying existing (GPDB-specific) files whose formatting mostly doesn't match pgindent v10, I think we'll do this:

- Change CLion's formatting settings to match the most salient rules enforced by pgindent v10
- Format new functions with CLion, but leave existing code unchanged
- Once our functional changes are merged, go back and format the whole file with pgindent v10, and submit a separate PR for that.

-Ben


Ashwin Agrawal

unread,
26 Oct 2017, 12:29:1226/10/2017
to Ben Christel, Heikki Linnakangas, Michael Schubert, Ming Li, Ning Yu, Mike Roth, Greenplum Developers, Amil Khanzada
On Thu, Oct 26, 2017 at 9:23 AM, Ben Christel <bchr...@pivotal.io> wrote:
Thanks, Heikki. That clears up a lot of the questions I had.

Since we're currently modifying existing (GPDB-specific) files whose formatting mostly doesn't match pgindent v10, I think we'll do this:

- Change CLion's formatting settings to match the most salient rules enforced by pgindent v10
- Format new functions with CLion, but leave existing code unchanged
- Once our functional changes are merged, go back and format the whole file with pgindent v10, and submit a separate PR for that.

+1. Yes I had used latest version of pg_indent from postgres master (will specify in commits henceforth).
 

Amil Khanzada

unread,
26 Oct 2017, 13:37:5426/10/2017
to Ashwin Agrawal, Ben Christel, Heikki Linnakangas, Michael Schubert, Ming Li, Ning Yu, Mike Roth, Greenplum Developers
Thanks Heikki, Ashwin, for clarifying things.

It's a bit confusing that gpdb's pgindent is not what is being used to format new gpdb code.

Perhaps we should do one of the below?
  1. Update gpdb's pgindent version to the latest
  2. Modify gpdb's src/tools/pgindent in the below ways:
    1. Update the README to say that developers should use the latest pgindent tool from Postgres + gpdb's typedefs.list. Also might be good to mention what shouldn't be pgindent'ed, i.e. upstream code that would cause merge conflicts.
    2. Remove all files except typedefs.list and README from the directory

Amil and Ben

Heikki Linnakangas

unread,
27 Oct 2017, 03:06:5427/10/2017
to Amil Khanzada, Ashwin Agrawal, Ben Christel, Michael Schubert, Ming Li, Ning Yu, Mike Roth, Greenplum Developers
On 10/26/2017 07:37 PM, Amil Khanzada wrote:
> Thanks Heikki, Ashwin, for clarifying things.
>
> It's a bit confusing that gpdb's pgindent is not what is being used to
> format new gpdb code.
> https://github.com/greenplum-db/gpdb/tree/master/src/tools/pgindent
>
> Perhaps we should do one of the below?
>
> 1. Update gpdb's pgindent version to the latest
> 2. Modify gpdb's src/tools/pgindent in the below ways:
> 1. Update the README to say that developers should use the latest
> pgindent tool from Postgres + gpdb's typedefs.list. Also might be good to
> mention what shouldn't be pgindent'ed, i.e. upstream code that
> would cause
> merge conflicts.
> 2. Remove all files except typedefs.list and README from the directory

Yeah, makes sense.

- Heikki

Amil Khanzada

unread,
27 Oct 2017, 12:58:5027/10/2017
to Heikki Linnakangas, Ashwin Agrawal, Ben Christel, Michael Schubert, Ming Li, Ning Yu, Mike Roth, Greenplum Developers
I propose we do option 2 ("Modify gpdb's src/tools/pgindent in the below ways") then, as option 1 would require more work that needs to be repeated in the future, as new versions of pgindent come out.

Amil Khanzada

unread,
31 Oct 2017, 17:06:1731/10/2017
to Heikki Linnakangas, Ashwin Agrawal, Ben Christel, Michael Schubert, Ming Li, Ning Yu, Mike Roth, Greenplum Developers
We've opened a PR for option 2. Please take a look!

Amil Khanzada

unread,
8 Nov 2017, 15:26:4508/11/2017
to Heikki Linnakangas, Ashwin Agrawal, Ben Christel, Michael Schubert, Ming Li, Ning Yu, Mike Roth, Greenplum Developers
Hi all,


Do you think it's important enough to add a note to the GPDB coding guidelines?
For example, "Please run pgindent on code submission if necessary, by following the code formatting guidelines mentioned in <src/tools/pgindent/README.gpdb>"

Best,
Amil

Amil Khanzada

unread,
9 Nov 2017, 14:41:5909/11/2017
to Heikki Linnakangas, David Sharp, Ashwin Agrawal, Ben Christel, Michael Schubert, Ming Li, Ning Yu, Mike Roth, Greenplum Developers

Ben Christel

unread,
4 Dec 2017, 12:58:2404/12/2017
to Greenplum Developers
Hi all,

In the course of investigating our options for formatting code, we've discovered that CLion's formatting presets allow us to get very close to Postgres style. We've found that a combination of LLDB style with Allman braces works well. The attached screenshots show how to set the style options.

Alignment of variable declarations in columns (i.e. with the type in one column and the the name in another) can be achieved with the "Wrapping and Braces > Variable Groups > Align in columns option". Note however that while Greenplum's style puts the "*" of pointer types just to the left of the name column, CLion pushes the name to the right to accommodate the "*". The second screenshot shows an example of this.


Inline image 1

Inline image 2


Hope this is helpful to anyone else out there who's using CLion to work on Greenplum code.

-Ben

Xin Zhang

unread,
4 Dec 2017, 19:21:3304/12/2017
to Ben Christel, Greenplum Developers
Is that possible to share a generated coding style xml file here to see if that can be used as gpdb.default.xml, so that people can import it?

This is what I got so far following the option above, e.g., do style of 'lldb', then 'allman braces', then check the 'align in columns'.

I have attached the version I have.

Thanks,
Shin

On Mon, Dec 4, 2017 at 9:58 AM Ben Christel <bchr...@pivotal.io> wrote:
Hi all,

In the course of investigating our options for formatting code, we've discovered that CLion's formatting presets allow us to get very close to Postgres style. We've found that a combination of LLDB style with Allman braces works well. The attached screenshots show how to set the style options.

Alignment of variable declarations in columns (i.e. with the type in one column and the the name in another) can be achieved with the "Wrapping and Braces > Variable Groups > Align in columns option". Note however that while Greenplum's style puts the "*" of pointer types just to the left of the name column, CLion pushes the name to the right to accommodate the "*". The second screenshot shows an example of this.


Screen Shot 2017-12-01 at 11.39.58 AM.png

Screen Shot 2017-12-04 at 9.46.53 AM.png
--
Thanks,
Shin
GPDB.Default.xml

Ben Christel

unread,
4 Dec 2017, 21:03:0804/12/2017
to Xin Zhang, Greenplum Developers
Hi Xin,

I tried importing the config file (as "IntelliJ IDEA code style XML"; see attached screenshot) and it seems to work as expected. Thanks!

Inline image 1

-Ben

Xin Zhang

unread,
5 Dec 2017, 12:52:0405/12/2017
to Ben Christel, Greenplum Developers
Wow, wonderful! Thanks for verifying it. Now, we have some good XML to flow around. Btw, in this case, where would be a good place for such xml to reside to? This doesn't below to the GPDB repo directly, but it will definitely help people to consolidate the IDE environment.

Thoughts?
Thanks,
Shin
--
Thanks,
Shin

Ashwin Agrawal

unread,
5 Dec 2017, 12:54:4705/12/2017
to Xin Zhang, Ben Christel, Greenplum Developers
On Tue, Dec 5, 2017 at 9:51 AM, Xin Zhang <xzh...@pivotal.io> wrote:
Wow, wonderful! Thanks for verifying it. Now, we have some good XML to flow around. Btw, in this case, where would be a good place for such xml to reside to? This doesn't below to the GPDB repo directly, but it will definitely help people to consolidate the IDE environment.

It can belong here, in repo
gpdb/src/tools/editors

Ben Christel

unread,
5 Dec 2017, 14:40:4705/12/2017
to Ashwin Agrawal, Xin Zhang, Greenplum Developers
We've fixed up the XML file and opened a PR: https://github.com/greenplum-db/gpdb/pull/4056

Thanks Xin and Ashwin for the suggestions on how to solve this!

-Ben and Amil

Xin Zhang

unread,
6 Dec 2017, 12:35:0506/12/2017
to Ben Christel, Ashwin Agrawal, Greenplum Developers
Started using it. Thanks a lot!
--
Thanks,
Shin
Reply all
Reply to author
Forward
0 new messages