# umh-support
d
I had three questions about populating the database which I couldn't figure out from the docs:
1) Am I correct to understand that `factoryinsight` is deprecated, and we're expected to use `umh_v2` exclusively?
2) Does populating the 8 tables from `umh_v2` still work via the topic name as follows?
* Via the `_analytics` schema: asset, work_order, shift, state, product, product_type
* Via the `_historian` schema: asset, tag, tag_string
3) Can the predefined database schema be extended with additional tables if my use case requires a more complicated data model? For example, a work order consists of multiple operations?
I'm referring to https://umh.docs.umh.app/docs/datamodel/. Is there any other documentation that I'm missing?
j
creating a thread to better organize
Will take a closer look later, but here are some changes in the bridges: https://learn.umh.app/course/new-bridges-and-automatic-helm-upgrade-in-umh/
1) factoryinsight is deprecated and only for legacy enterprise customers, indeed.
2) You can read how this works in the bridges article I linked. You can furthermore see there how the database schema is set up: https://github.com/united-manufacturing-hub/united-manufacturing-hub/blob/e030a22ebf1bf94c3935167c21787130627632f1/deployment/united-manufacturing-hub/templates/bridges/kafka_to_postgres/historian/configmap.yaml#L406-L443
3) I recommend "duplicating" an entire Benthos bridge and then adjusting it with new table names, so that _analytics stays UMH analytics and _historian stays UMH historian, but you can easily do something like _DenisAnalytics which is like 90% _analytics but also has custom changes.
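To make the "duplicate and adjust" idea concrete, here is a minimal sketch of what such a copied Kafka-to-Postgres bridge could look like, assuming a hypothetical `_DenisAnalytics` key filter and a made-up `operation` table; credentials, columns and the consumer group name are placeholders, not the actual UMH configuration.

```yaml
# Hypothetical duplicate of the kafka-to-postgres bridge for a custom
# "_DenisAnalytics" schema. All names, columns and credentials are placeholders.
input:
  kafka_franz:
    seed_brokers:
      - united-manufacturing-hub-kafka.united-manufacturing-hub.svc.cluster.local:9092
    regexp_topics: true
    topics:
      - umh.v1.*
    consumer_group: kafka_to_postgresql_denisanalytics_bridge

pipeline:
  processors:
    # Drop everything that is not addressed to the custom schema.
    - mapping: |
        root = if meta("kafka_key").or("").contains("_DenisAnalytics") { this } else { deleted() }

output:
  sql_insert:
    driver: postgres
    dsn: postgres://user:password@united-manufacturing-hub.united-manufacturing-hub.svc.cluster.local:5432/umh_v2
    table: operation            # custom table added on top of the standard schema
    columns: [work_order_id, operation_no, payload]
    args_mapping: |
      root = [ this.work_order_id, this.operation_no, content().string() ]
```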
d
Aah.. nice. This is something I will look further into. Much of the data we use today doesn't need to be time-series data. For some of the machines we save around 70 values on each cycle. This is not used for live data but more for historical data, to look back on specific machine states if needed. This is why I merge the data into a single JSON in Benthos, to get one timestamp for all values. This however breaks when saving it to TimescaleDB, since each object will get its own timestamp. I would like a more traditional schema for this data, where one table would be common for all machines with tags like cycle_running, production_rate_current, production_rate_planned, etc., and another table for machine-specific tags, since we have a lot of similar machine groups. A data bridge for this would be a huge step forward in saving space and time: just publish the data to e.g. _machinetype1 and it will push it to the 2 tables.
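To illustrate the "one wide row per cycle" idea (this is not an existing UMH component), a sketch of a bridge output writing into a hypothetical `machine_cycle` table; the `asset` and `timestamp_ms` fields are assumed to be present in the merged JSON.

```yaml
# Hypothetical wide-row output: one table row per machine cycle instead of
# one row per tag. Table name, columns and payload fields are illustrative.
output:
  sql_insert:
    driver: postgres
    dsn: postgres://user:password@united-manufacturing-hub.united-manufacturing-hub.svc.cluster.local:5432/umh_v2
    table: machine_cycle
    columns: [asset, timestamp_ms, cycle_running, production_rate_current, production_rate_planned]
    args_mapping: |
      root = [
        this.asset,            # assumed to be part of the merged JSON
        this.timestamp_ms,     # one timestamp for all values of the cycle
        this.cycle_running,
        this.production_rate_current,
        this.production_rate_planned
      ]
```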
What I also will look into is having something like a _debug or _analysis bridge that saves data for a predefined time: some easy setting to enable _debug for 30 days, push multiple machine tags to that topic to get the full history of each sensor, automatically delete data older than 30 days, and have a button or something to stop the debug and delete all data for that asset. This would be really good for troubleshooting machines and for evaluating new machine data. For example, if you have 4-5 sensors on a machine to calculate when it is running etc., it is quite nice to log all data for a while to see that your calculations are correct, without saving a bunch of scrap data to the database.
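The 30-day cleanup part of that idea could plausibly be handled on the TimescaleDB side with a retention policy; the sketch below bundles the DDL into the bridge via `init_statement`. The `debug_data` table, its columns and the RFC3339 `timestamp` field are assumptions for illustration only.

```yaml
# Hypothetical _debug bridge output: init_statement creates a hypertable and
# attaches a 30-day retention policy so old debug data is dropped automatically.
output:
  sql_insert:
    driver: postgres
    dsn: postgres://user:password@united-manufacturing-hub.united-manufacturing-hub.svc.cluster.local:5432/umh_v2
    init_statement: |
      CREATE TABLE IF NOT EXISTS debug_data (
        time  TIMESTAMPTZ NOT NULL,
        asset TEXT NOT NULL,
        name  TEXT NOT NULL,
        value DOUBLE PRECISION
      );
      SELECT create_hypertable('debug_data', 'time', if_not_exists => TRUE);
      SELECT add_retention_policy('debug_data', INTERVAL '30 days', if_not_exists => TRUE);
    table: debug_data
    columns: [time, asset, name, value]
    args_mapping: |
      # assumes the payload carries an RFC3339 timestamp string
      root = [ this.timestamp, this.asset, this.name, this.value ]
```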
d
If `_analytics` schema use is no longer recommended (as per the blog article), how will asset, work_order, shift, state, product, product_type be created and populated?
For 3) can you elaborate what you mean with "duplicating"? Do you mean editing the file https://github.com/united-manufacturing-hub/united-manufacturing-hub/blob/e030a22ebf1bf94c3935167c21787130627632f1/deployment/united-manufacturing-hub/templates/bridges/kafka_to_postgres/historian/configmap.yaml#L406-L443 by adding a case for topic names that contain `_DenisAnalytics` and defining the appropriate SQL code to create and populate the tables?
j
You can still use the old bridges and manually activate them if you want. They are simply not activated by default anymore. For 3): in the management console, you could simply copy and paste the existing bridges and create your own new ones. Use custom data flow components for this.
d
I assume the old _analytics bridge is written in Go instead of Benthos. I checked the GitHub repository but failed to find it. Could you point me to the file(s)? In essence, I'd like to reuse most of that logic, including the OEE calculation, but adapt the database schema slightly to my use case
l
(I also noticed that this is used in the latest webinar, as a custom bridge? I'm also not clear about this part at the moment. Would be great if we had some examples of how we can reproduce this) https://cdn.discordapp.com/attachments/1313876222429630525/1314561266131406888/CleanShot_2024-12-06_at_12.53.432x.png?ex=67543815&is=6752e695&hm=49ab1843c20b7a278de0fa1a8a8dfea91b602fc93de40c0d816e38f05e411f1c&
Hi Jeremy, how can I reactivate that bridge?
@Denis and I are working on a PoC to demonstrate the value to a metal fabricator. For this we need to create a table that is not work orders, but operations, which is a subset of work orders. I was hoping that we could use 90% of analytics, with an additional table? How should we approach this?
j
@Mateusz @krgrsebastian do you remember what you did there?
yes, this one
I checked with @Ferdinand and by default kafka-to-postgresql should still be running
l
Thanks @Jermuk appreciate any help here. If we get this to work we'll gladly share our learnings 🙌
m
@Luke van Enkhuizen We just created a new bridge based on the _historian sample as stated in the blog article, and in it deleted "Step 6" (see picture). The blog article says: "2. Create a new DFC for MQTT to Kafka: Check out the mqtt_to_kafka_historian sample and adjust it for the _analytics schema." https://learn.umh.app/course/new-bridges-and-automatic-helm-upgrade-in-umh/#for-users-of-the-analytics-schema https://cdn.discordapp.com/attachments/1313876222429630525/1314584063377539093/image.png?ex=67544d50&is=6752fbd0&hm=c5a73bd41acd4466e118ad341fb413145479fffca44d9e4c3f4039bfd13f84f2&
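Roughly, the MQTT-to-Kafka side of that setup could look like the sketch below; the broker address, the assumption of a four-level hierarchy before `_analytics`, and the fixed split at level 3 are all simplifications, so treat this as an illustration rather than the actual sample.

```yaml
# Simplified sketch of an MQTT-to-Kafka bridge adjusted for _analytics topics.
# Broker addresses and the topic pattern are placeholders.
input:
  mqtt:
    urls:
      - tcp://united-manufacturing-hub-mqtt.united-manufacturing-hub.svc.cluster.local:1883
    topics:
      - umh/v1/+/+/+/+/_analytics/#   # assumes a 4-level hierarchy; adjust to yours
    client_id: mqtt_to_kafka_analytics_bridge

pipeline:
  processors:
    # Split the MQTT topic into a Kafka topic and key at a fixed level of 3.
    - mapping: |
        let parts = meta("mqtt_topic").split("/")
        meta kafka_topic = $parts.slice(0, 3).join(".")
        meta kafka_key   = $parts.slice(3).join(".")

output:
  kafka_franz:
    seed_brokers:
      - united-manufacturing-hub-kafka.united-manufacturing-hub.svc.cluster.local:9092
    topic: ${! meta("kafka_topic") }
    key: ${! meta("kafka_key") }
```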
l
j
the benthos bridges even include the table generation, and potentially the migrations 🙂 we specifically designed it so that the entire logic is in a single place, can be versioned, easily duplicated and modified.
so steps 1-3 are together
the rest is correct 🙂
j
at one point it would get a little bit messy tbh to have so many tables and so much logic in a single config file / Benthos container. Technically possible, and our current path though!
d
@Luke van Enkhuizen yeah, but the way I currently see it is that we’ll need to create a bridge for every table
And just filter the topic by name, e.g. umh/v1/myEnterprise/…/originId/_MES/work_order: filter by topic name contains “_MES” and “work_order”
It could perhaps be done in one file using lots of switches, but like @Jermuk says it would get messy. I’ll think about it tomorrow when the Glühwein wears off 😝
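For completeness, a sketch of the single-file "lots of switches" variant, assuming the MES table name ends up in the Kafka key as discussed above; table names, columns and credentials are illustrative only.

```yaml
# Hypothetical single-file variant: one bridge, a switch output routing each
# MES table to its own SQL insert. This gets messy quickly, as noted above.
output:
  switch:
    cases:
      - check: meta("kafka_key").or("").contains("_MES.work_order")
        output:
          sql_insert:
            driver: postgres
            dsn: postgres://user:password@united-manufacturing-hub.united-manufacturing-hub.svc.cluster.local:5432/umh_v2
            table: work_order
            columns: [order_id, status, payload]
            args_mapping: root = [ this.order_id, this.status, content().string() ]
      - check: meta("kafka_key").or("").contains("_MES.operation")
        output:
          sql_insert:
            driver: postgres
            dsn: postgres://user:password@united-manufacturing-hub.united-manufacturing-hub.svc.cluster.local:5432/umh_v2
            table: operation
            columns: [work_order_id, operation_no, payload]
            args_mapping: root = [ this.work_order_id, this.operation_no, content().string() ]
```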
l
Prost!
Really impressive work. Definitely going to dig into this in the next few days
d
A question about the bridge: why not use MQTT directly to SQL and skip Kafka in between? At least when creating a custom bridge and you know that you will only use MQTT?
j
Because there is a high likelihood that messages will go missing, that was the main reason to add Kafka. See also https://learn.umh.app/blog/tools-techniques-for-scalable-data-processing-in-industrial-iot/
d
Yes, that part I know. However, it is not quite as bad as described in the article. You can still use QoS 2 and have message guarantees, and also retain. There is also LWT you could use. The case when a microservice crashes is harder to handle if messages are sent during the restart, but for normal machine data this would not be a problem. I mean, if you are already publishing the data to MQTT, what is the point of sending it to Kafka before pushing it to the database? It is just an extra step. For the UMH standard bridge I understand that with the bridge the user could choose to publish to MQTT or Kafka and everything will work, but when creating a custom bridge I see no value in pushing the data through another step. Or am I missing something else?
Nice. As far as I know the logs are a bit laggy atm. I had the same problem when starting up a new cluster: I had 15-20 error messages and I cleaned all messages in Kubernetes, but the messages were still visible in the console. It caches the messages somewhere if I remember correctly. There is a plan for improving this.
d
I'm now configuring the bridge from Kafka to TimescaleDB. I think my YAML code is fine, but I'm getting an error related to versioning. Has anyone had this one before?
[8:58:46 PM] - Action confirmed. Preparing to execute action.
[8:58:47 PM] - Editing data flow component.
[8:58:47 PM] - [Step 1/6] Retrieving configuration to get current settings.
[8:58:47 PM] - [Step 1/6] Parsing configuration to extract deployment parameters.
[8:58:47 PM] - [Step 3/4] Adding new data flow component to configuration.
[8:58:47 PM] - Error: failed to execute action: [Step 3/6] Failed to add new data flow component to configuration: refusing to deploy new version, not based on the currently deployed version (BasedOnUUID: 4392ae33-5ecd-42b9-953b-3cd541c3ebcf, DeployedVersion: a0b0cb21-2999-4fc1-9aea-f43c9ae876cd)
j
For example, the Tag Browser in the Management Console, incl. Message Tracing, would not be possible. Furthermore, QoS 2 still has message loss: it only guarantees the delivery to the broker, not to the receiving application. If the consumer is offline during that time, then the standard does not guarantee it anymore and it's entirely up to the implementation.
It will ignore old errors in the logs automatically after a certain duration.
Did you use multiple browser tabs? This error appears when you open a DFC for editing but in the meantime the underlying DFC or protocol converter was changed in a different tab, or someone else (in multi-user setups) changed it. This is done to prevent accidental code overwrites, so you need to be on the latest code change before you can edit. Simply re-open the DFC again.
d
So the tag browser uses Kafka? Yes, if the consumer is offline you could lose messages, but I mean: if you are pushing the messages from a machine using MQTT and using a bridge, then the bridge is the consumer, so if the consumer is offline it does not get sent to Kafka either. If you don't mind losing the tag browser functionality, I see no point in bridging it to Kafka before SQL if your machines all connect directly to MQTT. I guess the best would be to only use Kafka all the time, but that is not possible since MQTT is often the only protocol supported for many devices.
d
Confirming that this was indeed the problem
@Jermuk can you help me understand what the "merge point" is, and why it's supposed to be 3 or 5? It seems to play a role in creating the Redpanda topic name https://cdn.discordapp.com/attachments/1313876222429630525/1315619291919421491/image.png?ex=67581172&is=6756bff2&hm=f7d9792423125947a1d578d1fe3c7bb898290fe7499a7e5befda100ea87f9021&
j
The point where an MQTT topic gets broken into a Kafka topic and a Kafka key (as by default Kafka does not like having a lot of topics). We already did a successful PoC where we configured Kafka in such a way that many topics are actually possible, but it may take some time until it reaches nightly.
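To make the merge point concrete, here is a small illustrative Bloblang sketch (not the actual bridge code) of how a topic could be split at merge point 3 versus 5:

```yaml
# Illustrative only: splitting an MQTT topic at the merge point.
# merge point 3:  umh/v1/Belgium/Antwerp/Kalmthout/Heide/Begonia/_historian/temperature
#   -> Kafka topic: umh.v1.Belgium
#   -> Kafka key:   Antwerp.Kalmthout.Heide.Begonia._historian.temperature
# merge point 5 would instead give:
#   -> Kafka topic: umh.v1.Belgium.Antwerp.Kalmthout
#   -> Kafka key:   Heide.Begonia._historian.temperature
pipeline:
  processors:
    - mapping: |
        let parts = meta("mqtt_topic").split("/")
        meta kafka_topic = $parts.slice(0, 3).join(".")   # first 3 levels -> topic
        meta kafka_key   = $parts.slice(3).join(".")      # remainder      -> key
```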
d
Got it. ~~If I'm creating a custom `_MES` data contract, is there any reason why I can't just give the Redpanda topic the same name as the table in TimescaleDB, i.e. `asset`, `work_order`, `operation`, etc.? I don't think I need the feature from the current `_historian` data contract that populates an asset table based on the topic name, because we just query the assets from an external system via HTTP~~ My question doesn't make much sense... I'll take some time to think this through
l
Hi @Jermuk! @Denis and @Diederik and I had a close look at it, but we couldn't figure out how to get the _analytics working via Benthos yet. We would love to create our own _analytics2. Would you be so kind as to help us out and share a copy of the YAML config to mimic this ourselves? We promise to share our insights if we get it to work 😎
We just need a good sample to get the broad lines and then we can finetune it ourselves. Thanks in advance!
j
I quickly asked @Mateusz and he said they are using the Benthos bridges between MQTT and Kafka exactly like in the tutorial, and to insert it into Postgres they use the old kafka-to-postgresql.
i hope that is helpful 🙂
d
I'm creating a new custom Data Flow Component to transfer data from Redpanda to TimescaleDB by editing the kafka_to_postgresql_historian_bridge sample. I have a working Kafka topic `umh.v1.Belgium` where I receive messages with the Kafka key `Antwerp.Kalmthout.Heide.Begonia._mes.work_order`. This is an example message:
{
  "actual_end": null,
  "actual_start": null,
  "archived": false,
  "article_id": 12345,
  "created_at": 1733761490051,
  "created_by": "system",
  "external_order_id": "EXT-6925",
  "order_id": 403074,
  "planned_end": "2024-12-10T16:24:50.051Z",
  "planned_quantity": 100,
  "planned_start": "2024-12-09T16:24:50.051Z",
  "status": "pending"
}
The problem is that the
input_received
count under Metrics in the management console stays at zero, even though new messages are successfully being written to the kafka topic. I don't understand why the input node doesn't receive data. I checked that: - I'm using the correct consumer
kafka_to_postgresql_historian_bridge
- the topic name is
umh.v1.Belgium
which should match the wildard - the consumer group
kafka_to_postgresql_historian_bridge
is listed in the list of consumers for that topic in Redpanda Here is the input YAML:
input:
  kafka_franz:
    seed_brokers:
      - united-manufacturing-hub-kafka.united-manufacturing-hub.svc.cluster.local:9092
    regexp_topics: true
    topics:
      - umh.v1.*
    consumer_group: kafka_to_postgresql_historian_bridge
    auto_replay_nacks: false
I added a file with the complete bridge YAML to this message: https://cdn.discordapp.com/attachments/1313876222429630525/1315719705318658108/question.yaml?ex=67586ef6&is=67571d76&hm=4fb03d290f7e79df6a76ee59637c9188d89c840378bce5c4307bf19cb1a69845&
By the way, in the output YAML I had to hardcode the password in the database URI, otherwise Postgres throws an error:
dsn: postgres://kafkatopostgresqlv2:{{ include "united-manufacturing-hub.postgresql.kafkatopostgresqlv2.password" . }}@united-manufacturing-hub.united-manufacturing-hub.svc.cluster.local:5432/umh_v2
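If hardcoding is a concern: since the Helm `include` expression above apparently isn't rendered inside a custom DFC, one alternative could be Benthos' environment variable interpolation, assuming the password is exposed to the DFC container as an environment variable (the `POSTGRES_PASSWORD` name and the secret wiring are assumptions, not something UMH sets up by default):

```yaml
# Hypothetical alternative to hardcoding: Benthos resolves ${VAR} from the
# container environment. POSTGRES_PASSWORD would have to be provided to the
# DFC pod (e.g. via a Kubernetes secret); that wiring is an assumption.
output:
  sql_insert:
    driver: postgres
    dsn: postgres://kafkatopostgresqlv2:${POSTGRES_PASSWORD}@united-manufacturing-hub.united-manufacturing-hub.svc.cluster.local:5432/umh_v2
    table: work_order
    columns: [order_id, status]
    args_mapping: root = [ this.order_id, this.status ]
```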
f
The input message in your case is invalid: you defined created_at as a string but gave it an integer. As for input_received, it might only increase when a message is committed as processed back to Kafka.
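If changing the producer isn't an option, one alternative (a sketch, assuming the schema really does expect `created_at` as a string) would be to coerce the field in a small Bloblang step before the insert:

```yaml
# Illustrative only: coerce created_at to a string so it matches a schema
# that declares it as a string, instead of changing the producer.
pipeline:
  processors:
    - mapping: |
        root = this
        root.created_at = this.created_at.string()
```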
d
Oops, that was it.. thanks! I'm bookmarking your tool 😉
We now have the following architecture for our bridges:
1) One mqtt_to_kafka_mes_bridge that transfers and combines messages from several `umh/v1/<ISA-95 hierarchy>/<asset>/_mes/<MES table>` MQTT topics into a single Kafka topic `umh.v1.<ISA-95 hierarchy>.mes`, where the Kafka key is equal to `<MES table>`.
2) An individual kafka_to_timescale_mes_bridge for each `<MES table>` that transfers Kafka messages from the `umh.v1.<ISA-95 hierarchy>.mes` topic into a database table based on the value of the `kafka_key`.
Here's an example of the Kafka messages in the Redpanda UI (see image). Each bridge works individually when tested with an "inject" node in Node-RED. However, whenever both bridges are activated, and thus both read from the same Kafka topic, randomly only one bridge manages to get its data into the database. It's as if it's not possible to have multiple bridges listening to the same Kafka topic if all messages have the same timestamp. Could this be the case? The solution would be to either split the single `umh.v1.<ISA-95 hierarchy>.mes` topic into many topics like `umh.v1.<ISA-95 hierarchy>.mes.work_order` etc. (doesn't work: see below), or alternatively combine all the individual MES bridges into one large MES bridge and filter by kafka_key. This is possible, but it's messier. Therefore I'd like to ask if my assumption about this being a problem is correct. https://cdn.discordapp.com/attachments/1313876222429630525/1316784501061324800/image.png?ex=675c4ea1&is=675afd21&hm=d049f35a80e8780485a2c842da748073c0312dbc62bbc551f4361ec3c9aa8ce4&
I just tested the first solution (where we split the `umh.v1.<ISA-95 hierarchy>.mes` topic into many topics like `umh.v1.<ISA-95 hierarchy>.mes.work_order` etc.) and the same problem occurs: the first bridge (1) works flawlessly, but the kafka_to_timescale bridges don't run. It's as if they are not being activated and are ignoring new messages in the topics they listen to. Any tips on how to debug such a bridge? I already posted the code here: https://discord.com/channels/700613971941785680/1316366172907638835/1316779471692628019
j
> Each bridge works individually when tested with an "inject" node in Node-RED. However, whenever both bridges are activated, and thus both read from the same Kafka topic, randomly only one bridge manages to get its data into the database.
Did you check the name of the consumer group? This must be different 🙂
If two consumers are in the same group, they will load balance the messages.
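In config terms that just means each bridge's Kafka input needs its own `consumer_group`; a minimal sketch with made-up group names:

```yaml
# Bridge A: its own consumer group, so it receives every message on the topic.
input:
  kafka_franz:
    seed_brokers:
      - united-manufacturing-hub-kafka.united-manufacturing-hub.svc.cluster.local:9092
    topics: [ umh.v1.Belgium ]
    consumer_group: kafka_to_timescale_mes_work_order_bridge

# Bridge B (a separate DFC / config file) must use a *different* group name,
# e.g. consumer_group: kafka_to_timescale_mes_operation_bridge. Otherwise the
# two bridges form one group and Kafka load-balances the messages between them.
```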
d
Ahh 99% that’s the issue: we use the same consumer group. Will check tomorrow
Confirming that this was indeed the problem
j
❤️
d
Is it problematic to create the SQL schema outside of the bridge(s)? We have a file that instantiates the schema like in the attached SQL file https://cdn.discordapp.com/attachments/1313876222429630525/1317158147621650453/message.txt?ex=675daa9e&is=675c591e&hm=c3fbdfa371bd2a6580e1b51a77f5911f30e2f403511be5c6c0d8d8d30762c7e4&
j
No, not from a technical perspective. It might be helpful though to have "everything together" and be able to reproduce the entire stack within the single YAML.
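For reference, a sketch of how the schema creation could live inside the bridge itself via `sql_insert`'s `init_statement`; the `work_order`/`operation` tables here are illustrative stand-ins for whatever the attached SQL file actually defines.

```yaml
# Illustrative only: DDL bundled into the bridge so the whole stack lives in
# one YAML. The tables are examples, not the schema from the attached file.
output:
  sql_insert:
    driver: postgres
    dsn: postgres://user:password@united-manufacturing-hub.united-manufacturing-hub.svc.cluster.local:5432/umh_v2
    init_statement: |
      CREATE TABLE IF NOT EXISTS work_order (
        order_id BIGINT PRIMARY KEY,
        status   TEXT NOT NULL
      );
      CREATE TABLE IF NOT EXISTS operation (
        work_order_id BIGINT REFERENCES work_order(order_id),
        operation_no  INT NOT NULL,
        status        TEXT,
        PRIMARY KEY (work_order_id, operation_no)
      );
    table: operation
    columns: [work_order_id, operation_no, status]
    args_mapping: root = [ this.work_order_id, this.operation_no, this.status ]
```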