Application architecture for multi-site manufacturing

Pankaj Jangid

unread,

Aug 19, 2019, 7:44:40 AM8/19/19

to

I am designing an application for a manufacturing company. And I have
decided to go with an open source implementation from top to bottom. And
obviously, PostgreSQL is one of the choices that I have made due to many
reasons. Most important being its ability to handle large amounts of
data, speed, and reliability.

# The Application

Each manufacturing unit produces its own data and is totally unrelated
to the data from other sites. I have tried to repeatedly ask the site
owners whether this will remain the case in future or it might change. I
gave them examples that a part manufactured in one unit is used in
other modules produced in a different unit. As of now they are saying
that this case won't occur. But I want to take care of this if they come
up with this sort of change 3-4 years after implementation. Now the
scenarios is like this:

Manufacturing units: A, B, C
Nodes at unit A: NA1, NA2, NA3, ..., NA100
Nodes at unit B: NB1, NB2, NB3, ..., NB100
Nodes at unit C: NC1, NC2, NC3, ..., NC100

At the hardware-level, I am thinking of this architecture:

Each node at a unit connects to local in-memory Redis cache. Redis nodes
are configured as slaves and a master node sits centrally at the
manufacturing unit. An application will keep on clearing stale data
from the master Redis and push it to a PostgreSQL running at the site. I
also plan to keep a replication server also for PostgreSQL at each site.

The company CXOs want to view consolidated data also. So I want to
connect all these units via some sort of mechanism provided by
PostgreSQL. For example partitioning.

I want to get advice from the PostgreSQL community on this. I have not
done multi-site projects in the past. So before going ahead I want to
have advice on a reasonably futureproof architecture. Any help is
appreciated.

Regards.

--
Pankaj
Planet Earth.

Christian Barthel

unread,

Sep 15, 2019, 2:12:03 PM9/15/19

to

Hello,

Pankaj Jangid <pankaj...@gmail.com> writes:

> I am designing an application for a manufacturing company. And I have
> decided to go with an open source implementation from top to bottom. And
> obviously, PostgreSQL is one of the choices that I have made due to many
> reasons. Most important being its ability to handle large amounts of
> data, speed, and reliability.

That is not such a bad decision, I guess.

[..]

> I want to get advice from the PostgreSQL community on this. I have not
> done multi-site projects in the past. So before going ahead I want to
> have advice on a reasonably futureproof architecture. Any help is
> appreciated.

I can not comment much on the above because I do not have a lot
of experience with designing de-central, big systems(*). I would
try to find a simple way of dealing with the existing demands and
improve the existing solution incrementally. Is a Redis cache
really necessary? If you have a local copy of the data on each
site, the performance of PostgreSQL may be good enough? Can you
add performance improvements later? Can you make it modular and
extendable for future demands etc.

Apart from that, more experienced people might be available on
the mailinglist at https://lists.postgresql.org/. There is a lot
more conversation going on.

(*) It is more or less a reply to let you know that there is
at least one person who still reads this group.
--
Christian Barthel <b...@online.de>

Pankaj Jangid

unread,

Sep 15, 2019, 4:32:03 PM9/15/19

to

Christian Barthel <b...@online.de> writes:

> I can not comment much on the above because I do not have a lot
> of experience with designing de-central, big systems(*). I would
> try to find a simple way of dealing with the existing demands and
> improve the existing solution incrementally. Is a Redis cache
> really necessary? If you have a local copy of the data on each
> site, the performance of PostgreSQL may be good enough? Can you
> add performance improvements later? Can you make it modular and
> extendable for future demands etc.
>

This is a really practicle advice. Thanks. Given the timeline of this
project I wanted some suggestion to deal with the complexity. And this
is really a breather. I can certainly apply incremental improvement.

> Apart from that, more experienced people might be available on
> the mailinglist at https://lists.postgresql.org/. There is a lot
> more conversation going on.
>

Yes. That would be more appropriate I also guess.

> (*) It is more or less a reply to let you know that there is
> at least one person who still reads this group.

I really appreciate your reply. Felt so lonely after posting my
query. Your reply is a hope.

Regards.
--
Pankaj Jangid

Christian Barthel

unread,

Sep 17, 2019, 3:11:23 PM9/17/19

to

Pankaj Jangid <pankaj...@gmail.com> writes:

> This is a really practicle advice. Thanks. Given the timeline of this
> project I wanted some suggestion to deal with the complexity. And this
> is really a breather. I can certainly apply incremental improvement.

One further thing I noticed during the last few months on a
personal project: if you have time left, you may also think about
how and where to add debugging and logging code.

Making state transition explicit and visible can help
tremendously when something does not operate "as intended".
I develop some code in PL/pgSQL (triggers and database
functions) and added small if/else branches with RAISE NOTICE
messages. I can turn on and off such messages with a
configuration table in the database and see how various functions
operate along the way.

>> (*) It is more or less a reply to let you know that there is
>> at least one person who still reads this group.
>
> I really appreciate your reply. Felt so lonely after posting my
> query. Your reply is a hope.

Sure! If you are further interested in reporting about the
progress, your decisions or tools, please feel free to report
it here. Your project sounds interesting and maybe, others have
some further hints or tips when reading more about your project.

--
Christian Barthel <b...@online.de>

Mladen Gogala

unread,

Sep 21, 2019, 11:22:39 AM9/21/19

to

On Mon, 16 Sep 2019 02:02:01 +0530, Pankaj Jangid wrote:

> Christian Barthel <b...@online.de> writes:
>
>> I can not comment much on the above because I do not have a lot
>> of experience with designing de-central, big systems(*). I would
>> try to find a simple way of dealing with the existing demands and
>> improve the existing solution incrementally. Is a Redis cache
>> really necessary? If you have a local copy of the data on each
>> site, the performance of PostgreSQL may be good enough? Can you
>> add performance improvements later? Can you make it modular and
>> extendable for future demands etc.
>>
>
> This is a really practicle advice. Thanks. Given the timeline of this
> project I wanted some suggestion to deal with the complexity. And this
> is really a breather. I can certainly apply incremental improvement.

You should be aware of the CAP theorem which says that you cannot have
consistency, availability and partition tolerance at the same time. You
can only have 2 of those things. Second, there is no need to re-invent
the wheel. There are open source ERP systems in the wild, like Dolibarr
or ERPNext. Since they are open source, you can use their data model to
build your own system, Business case for ERP software is usually
centralization of resource management and the savings that are the
result of central resource management. Your design choice of going
distributed seems to contradict the usual wisdom of centralizing the
ERP systems. I don't say that it cannot work, but it may prove
difficult.Last but not least: you should be aware that this is a
massive undertaking and that you will need many programmers to complete
the task. The only way to keep costs under control is to use Java ORM
and an application server. Many people know how to use Java and JBoss
and programmers are relatively easy to come by. All standard ORMs like
Hibernate, MyBatis and Apache Cayenne support PostgreSQL.

--
Mladen Gogala
Database Consultant
http://mgogala.byethost5.com

Pankaj Jangid

unread,

Sep 22, 2019, 9:39:14 AM9/22/19

to

Mladen Gogala <gogala...@gmail.com> writes:

> You should be aware of the CAP theorem which says that you cannot have
> consistency, availability and partition tolerance at the same time. You
> can only have 2 of those things. Second, there is no need to re-invent
> the wheel. There are open source ERP systems in the wild, like Dolibarr
> or ERPNext. Since they are open source, you can use their data model to
> build your own system, Business case for ERP software is usually
> centralization of resource management and the savings that are the
> result of central resource management. Your design choice of going
> distributed seems to contradict the usual wisdom of centralizing the
> ERP systems. I don't say that it cannot work, but it may prove
> difficult.Last but not least: you should be aware that this is a
> massive undertaking and that you will need many programmers to complete
> the task. The only way to keep costs under control is to use Java ORM
> and an application server. Many people know how to use Java and JBoss
> and programmers are relatively easy to come by. All standard ORMs like
> Hibernate, MyBatis and Apache Cayenne support PostgreSQL.

I'll be running the system in a manufacturing scenario. Consistency and
availability are highly desirable in this case. Various sensors will be
setting state in the manufacturing process and every stage of product
must be aware of the correct states all the time.

Finally I decided to create a centralized database. I am using Rust with
Diesel ORM. It runs really fast on Raspberry Pi attached to the sensors.

--
Pankaj Jangid