Is there a way to get a prompt on a deployed service?
addamsson
PROOP

2 years ago

I'm trying to debug my observability setup (whether prometheus is accessible from my deployed app). Is this possible?

Solved

128 Replies

addamsson
PROOP

2 years ago

c96ea2ca-1b03-415f-8931-5af0e51c87c4


addamsson
PROOP

2 years ago

what's weird is that i see an arrow from my app to postgres but not to my other deployments... is it possible that they are not on the same network?



addamsson
PROOP

2 years ago

I tried this in local and it worked


2 years ago

> Is there a way to get a prompt on a deployed service

there is not, railway does not provide a way to ssh into a service.

> whether prometheus is accessible from my deployed app

it is as long as you are listening on ipv6 and using the correct port, as railway's private network is ipv6 only.

> what's weird is that i see an arrow from my app to postgres but not to my other deployments

the arrows and their directions are dynamically detected when you use reference variables, you are likely hardcoding variables when you shouldn't be.

> is it possible that they are not on the same network?

it's not possible, every service within a given environment in a project is in the same private network.

> I tried this in local and it worked

local is likely ipv4, your services need to listen on an ipv6 host, or ideally they should all dual stack bind.
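
For readers wondering what "listening on an ipv6 host" looks like in practice, here is a minimal sketch for a Node service. It assumes a plain `node:http` server; the helper names `dualStackOptions` and `startServer` and the `PORT` variable are illustrative, not something from this thread. Binding to `::` with `ipv6Only: false` asks the OS for a dual-stack socket, so the same listener can accept both IPv6 and IPv4 connections.

```typescript
import { createServer } from "node:http";
import type { Server } from "node:http";
import type { ListenOptions } from "node:net";

// Build listen options for a dual-stack bind: "::" is the IPv6 wildcard
// address, and ipv6Only: false keeps IPv4 connections working too.
function dualStackOptions(port: number): ListenOptions {
  return { host: "::", port, ipv6Only: false };
}

// Hypothetical helper: start a tiny HTTP server with the options above.
function startServer(port: number): Server {
  const server = createServer((_req, res) => {
    res.writeHead(200, { "content-type": "text/plain" });
    res.end("ok\n");
  });
  server.listen(dualStackOptions(port), () => {
    console.log(`listening on [::]:${port}`);
  });
  return server;
}

// Usage (not run automatically):
// startServer(Number(process.env.PORT ?? 3000));
```

Services that are not written in Node usually have an equivalent setting, typically a listen address of `::` instead of `0.0.0.0`.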


addamsson
PROOP

2 years ago

this is the first time i hear about this (and i've been using aws a lot)...what's an ipv6 host and how do i listen to one? what's a dual stack bind? 😅


addamsson
PROOP

2 years ago

this worked literally everywhere before (local, aws, heroku) and it only fails on railway


addamsson
PROOP

2 years ago

are there docs on this?


addamsson
PROOP

2 years ago

also why does it work this way?


addamsson
PROOP

2 years ago

it forces people to change their code so that they can accommodate the quirks of railway


addamsson
PROOP

2 years ago

there must be a good reason


addamsson
PROOP

2 years ago

i don't even understand why ipv6 is a concern when i'm using domain names eg: "http://prometheus:1234"


addamsson
PROOP

2 years ago

i checked and there is no docs on this


addamsson
PROOP

2 years ago

it mentions that private networking uses wireguard


addamsson
PROOP

2 years ago

and some sort of mesh


addamsson
PROOP

2 years ago

(whatever those might be)


addamsson
PROOP

2 years ago

but no details on what this means when i try to send metrics/traces


2 years ago

IPv6 is not a quirk at all, and not something specific to Railway, you just happen to have always worked on IPv4 networks before.

There are docs for this.


2 years ago

because the domain names will resolve to an IPv6 address
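
To see this for yourself, a service can resolve a private hostname and report the address family of each result. The sketch below uses Node's built-in `dns` and `net` modules; the hostname in the usage comment is only an example, and `describeAddress` and `logResolvedFamilies` are hypothetical helper names.

```typescript
import { lookup } from "node:dns/promises";
import { isIP } from "node:net";

// Classify a literal address: net.isIP returns 4, 6, or 0 (not an IP).
function describeAddress(address: string): "IPv4" | "IPv6" | "not an IP" {
  const family = isIP(address);
  if (family === 4) return "IPv4";
  if (family === 6) return "IPv6";
  return "not an IP";
}

// Resolve every address for a hostname and log which family it belongs to.
async function logResolvedFamilies(hostname: string): Promise<void> {
  const results = await lookup(hostname, { all: true });
  for (const { address } of results) {
    console.log(`${hostname} -> ${address} (${describeAddress(address)})`);
  }
}

// Usage (inside the deployed service, hostname is an example):
// logResolvedFamilies("prometheus.railway.internal").catch(console.error);
```

If every result is IPv6, a target that only listens on `0.0.0.0` will not accept the connection, even though the name resolves fine.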


addamsson
PROOP

2 years ago

so why can't i send metrics from prometheus and traces from tempo?


addamsson
PROOP

2 years ago

I get that, but the fact is that my code worked everywhere else


addamsson
PROOP

2 years ago

i only have to change it because of railway


2 years ago

yep, as mentioned everywhere else just happened to be IPv4


addamsson
PROOP

2 years ago

you don't get my point


addamsson
PROOP

2 years ago

and this is a tendency here


addamsson
PROOP

2 years ago

you implement something, cut corners by not implementing ipv4, and then you imply that all of this is totally OK, and your users are to blame for not thinking about the corners you cut 😅


addamsson
PROOP

2 years ago

and this is not the first case 😒


addamsson
PROOP

2 years ago

i'm not saying that you're not right, but from a user perspective it is a slap in the face


2 years ago

not implementing IPv4 is not corner cutting, IPv6 was introduced over two decades ago


addamsson
PROOP

2 years ago

in your opinion


addamsson
PROOP

2 years ago

see, that again


addamsson
PROOP

2 years ago

my program works everywhere but here


addamsson
PROOP

2 years ago

but all blame is assigned to me


2 years ago

I don't like to compare but Fly.io's private network is also IPv6 only


addamsson
PROOP

2 years ago

haven't used it


2 years ago

Just an example to show that we are not the only ones who choose to use a newer standard


addamsson
PROOP

2 years ago

i get that, but it is beside my point


addamsson
PROOP

2 years ago

i'm not arguing that this is a better way


addamsson
PROOP

2 years ago

or more forward-compatible


addamsson
PROOP

2 years ago

so if i understand this correctly the problem is tempo / prometheus is not configured to listen on ipv6? i'm using a metric exporter


addamsson
PROOP

2 years ago

interestingly enough grafana can connect to both


2 years ago

If your app is attempting to connect to tempo / prometheus, they need to listen on IPv6, that is correct


addamsson
PROOP

2 years ago

i'm using a fork of a railway template


addamsson
PROOP

2 years ago

that's why i'm assuming that it would work


2 years ago

you'd also need to be using the correct port when connecting to them, same way you use ports even developing locally


addamsson
PROOP

2 years ago

i'm doing just that in theory


addamsson
PROOP

2 years ago

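// Imports were not shown in the original post; the package names below are assumed.
import { NodeSdk } from "@effect/opentelemetry";
import { getNodeAutoInstrumentations } from "@opentelemetry/auto-instrumentations-node";
import { OTLPMetricExporter } from "@opentelemetry/exporter-metrics-otlp-http";
import { OTLPTraceExporter } from "@opentelemetry/exporter-trace-otlp-http";
import { PeriodicExportingMetricReader } from "@opentelemetry/sdk-metrics";
import { BatchSpanProcessor } from "@opentelemetry/sdk-trace-base";

const NodeSdkLive = NodeSdk.layer(() => ({
    resource: { serviceName: "Larisel" },
    // Traces are batched and sent to the tempo service over the private network.
    spanProcessor: new BatchSpanProcessor(
        new OTLPTraceExporter({
            url: "http://tempo:3100",
        }),
    ),
    instrumentations: [getNodeAutoInstrumentations()],
    // metricReader: new PrometheusExporter({ port: 9090 }),
    // Metrics are pushed to the prometheus service every 500 ms.
    metricReader: new PeriodicExportingMetricReader({
        exportIntervalMillis: 500,
        exporter: new OTLPMetricExporter({
            url: "http://prometheus:9090",
        }),
    }),
}));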
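
One thing worth checking in a config like this is the exporter URLs. OTLP over HTTP conventionally uses port 4318 with the paths `/v1/traces` and `/v1/metrics`, and Prometheus serves its OTLP receiver (when enabled) under `/api/v1/otlp/v1/metrics`, whereas 3100 is the port Tempo's own HTTP server reports in the logs further down. The sketch below only builds the URLs. The hostnames and the `otlpEndpoints` helper are assumptions for illustration, and nothing in the thread confirms this is the fix the poster ended up using.

```typescript
interface OtlpEndpoints {
  traces: string;
  metrics: string;
}

// Hypothetical helper: derive OTLP/HTTP URLs from private hostnames.
function otlpEndpoints(tempoHost: string, prometheusHost: string): OtlpEndpoints {
  return {
    // Tempo's OTLP HTTP receiver (conventional port 4318).
    traces: `http://${tempoHost}:4318/v1/traces`,
    // Prometheus serves its OTLP receiver on its normal HTTP port.
    metrics: `http://${prometheusHost}:9090/api/v1/otlp/v1/metrics`,
  };
}

// Example hostnames; substitute the names of your own services.
const endpoints = otlpEndpoints("tempo.railway.internal", "prometheus.railway.internal");
console.log(endpoints.traces);
console.log(endpoints.metrics);
```

The resulting strings would go into the `url` fields of the trace and metric exporters.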

2 years ago

The vast majority of templates are user provided, it's possible they didn't fully test the template?


addamsson
PROOP

2 years ago

i don't know, maybe they have some other use case


2 years ago

can you send me over some actual error messages?


addamsson
PROOP

2 years ago

there are no errors interestingly


addamsson
PROOP

2 years ago

i'm using the official otel library


addamsson
PROOP

2 years ago

i just don't see the metrics


addamsson
PROOP

2 years ago

(nor the traces)


2 years ago

is there perhaps a verbose debug mode you can turn on?


addamsson
PROOP

2 years ago

i'll take a look


addamsson
PROOP

2 years ago

but i think the main problem is that prom/tempo is not configured for ipv6


addamsson
PROOP

2 years ago

i'm trying to use the setup that I was using before (metrics through prom, traces through tempo, aggregated in grafana)


addamsson
PROOP

2 years ago

once i figure this out i can share the code too if somebody is interested


2 years ago

I'm sure future readers would love that


addamsson
PROOP

2 years ago

👍


addamsson
PROOP

2 years ago

maybe i can post an article about it


addamsson
PROOP

2 years ago

anyway


addamsson
PROOP

2 years ago

is it easier / more secure / other if you use ipv6?


addamsson
PROOP

2 years ago

what's the rationale behind it?


2 years ago

I'm sorry I wouldn't know the exact reasons why it was chosen, I was not around for that discussion


addamsson
PROOP

2 years ago

ah, ok


addamsson
PROOP

2 years ago

in the docs it says that i have to use x.railway.internal in order for this to work, but in the railway app it says "you can also call me at x". so would it make a difference if i used x.railway.internal:1234 instead of x:1234 or is it the same?


2 years ago

it shouldn't, I've never seen it make a difference


addamsson
PROOP

2 years ago

have you ever used prometheus or tempo on railway?


2 years ago

I have not


addamsson
PROOP

2 years ago

the search goes on then 😄


2 years ago

have you got your services to listen on ipv6?


addamsson
PROOP

2 years ago

i had to enable the otlp write receiver


addamsson
PROOP

2 years ago

now i can see metrics in prom


2 years ago

awesome!


addamsson
PROOP

2 years ago

looks like the ipv6 settings crash tempo


addamsson
PROOP

2 years ago

but when i push this to railway i get a wall of errors


addamsson
PROOP

2 years ago

1282786184388673607


addamsson
PROOP

2 years ago

this on repeat


2 years ago

link me to the specific service please


addamsson
PROOP

2 years ago

what is the name of the network interface?


addamsson
PROOP

2 years ago

i think eth0 doesn't exist


2 years ago

railnet0, but it shouldn't matter since you shouldn't be hardcoding any interfaces


addamsson
PROOP

2 years ago

b502be2b-037a-47c7-8744-2fe5e2c93a2d


addamsson
PROOP

2 years ago

tempo does that i think


2 years ago

bad practice 😬


2 years ago

what env?



addamsson
PROOP

2 years ago

staging


addamsson
PROOP

2 years ago

level=info ts=2024-09-09T19:40:18.412940643Z caller=main.go:121 msg="Starting Tempo" version="(version=r165-7421936, branch=r165, revision=7421936ba)"

level=info ts=2024-09-09T19:40:18.414789251Z caller=cache.go:55 msg="caches available to storage backend" parquet-footer=false bloom=false parquet-offset-idx=false parquet-column-idx=false trace-id-index=false parquet-page=false

level=info ts=2024-09-09T19:40:18.41779872Z caller=server.go:249 msg="server listening on addresses" http=[::]:3100 grpc=[::]:9095

level=info ts=2024-09-09T19:40:18.418135313Z caller=cache.go:55 msg="caches available to storage backend" parquet-footer=false bloom=false parquet-offset-idx=false parquet-column-idx=false trace-id-index=false parquet-page=false

level=warn ts=2024-09-09T19:40:18.42118818Z caller=netutil.go:90 msg="error getting addresses for interface" inf=eth0 err="route ip+net: no such network interface"

level=info ts=2024-09-09T19:40:18.421222206Z caller=memberlist_client.go:439 msg="Using memberlist cluster label and node name" cluster_label= node=e876d5c06828-2d8cd974

level=warn ts=2024-09-09T19:40:18.421335895Z caller=netutil.go:90 msg="error getting addresses for interface" inf=en0 err="route ip+net: no such network interface"

level=error ts=2024-09-09T19:40:18.421379281Z caller=main.go:124 msg="error running Tempo" err="failed to init module services: error initialising module: compactor: failed to create compactor: no useable address found for interfaces [eth0 en0]"

2 years ago

best to not hardcode an interface, but if you have to it's railnet0.

now that i'm looking at this, i think i should note that the ipv4 address is used for public traffic.
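
If a tool insists on an interface name, it can help to list what the container actually has before hardcoding anything. This is a rough sketch using Node's `os.networkInterfaces()`; `findIpv6Interfaces` is a hypothetical helper, and the expectation that `railnet0` shows up is taken from the reply above rather than verified here.

```typescript
import { networkInterfaces } from "node:os";

// Minimal shape of what os.networkInterfaces() returns, enough for this check.
type InterfaceMap = Record<
  string,
  { family: string | number; internal: boolean }[] | undefined
>;

// Return names of non-loopback interfaces that carry at least one IPv6 address.
// Some Node 18 releases report family as the number 6 instead of "IPv6".
function findIpv6Interfaces(ifaces: InterfaceMap): string[] {
  return Object.entries(ifaces)
    .filter(([, addrs]) =>
      (addrs ?? []).some(
        (a) => !a.internal && (a.family === "IPv6" || a.family === 6),
      ),
    )
    .map(([name]) => name);
}

// Log the candidates on whatever machine this runs on.
console.log(findIpv6Interfaces(networkInterfaces()));
```

Logging this once at startup shows which name, if any, belongs in settings such as `instance_interface_names`.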


addamsson
PROOP

2 years ago

i don't hardcode stuff, but this comes from the tracing tool i've been using (Grafana Tempo)


addamsson
PROOP

2 years ago

you mean the public api?


addamsson
PROOP

2 years ago

i only added a public api so that i can check it with postman


addamsson
PROOP

2 years ago

that's how i figured out what the problem was with prom


2 years ago

i didn't mention a public api?


addamsson
PROOP

2 years ago

i'm not sure what you meant


addamsson
PROOP

2 years ago

😅


addamsson
PROOP

2 years ago

ok it seems the interface name is hardcoded in many places



addamsson
PROOP

2 years ago

lemme try this


2 years ago

look at the link i sent for context on what i said


addamsson
PROOP

2 years ago

railnet0, right?


2 years ago

yes, please look at the link


addamsson
PROOP

2 years ago

the stats page?


addamsson
PROOP

2 years ago

let's see if this solves the issue


addamsson
PROOP

2 years ago

ok, tempo booted up


addamsson
PROOP

2 years ago

still no traces though


2 years ago

that's not ideal


addamsson
PROOP

2 years ago

something is amiss, but i see no requests in the log


addamsson
PROOP

2 years ago

so it might be in my service


addamsson
PROOP

2 years ago

when i tamper with tempo from grafana i can see it in the logs


addamsson
PROOP

2 years ago

good news is that app -> prom -> grafana is now working


2 years ago

awesome!


addamsson
PROOP

2 years ago

is there a way to expose a port on a deployment?


addamsson
PROOP

2 years ago

i think tempo is listening on a hardcoded port


2 years ago

publicly?


addamsson
PROOP

2 years ago

no, only on the private network


2 years ago

you don't need to do anything for that, there is no firewall or anything
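
Since there is no shell to test reachability from, one workaround is to have the app itself attempt a TCP connection at startup and log the result. This is a minimal sketch using Node's `net` module; the target string, the timeout, and the `parseTarget` and `canConnect` helper names are illustrative assumptions.

```typescript
import { connect } from "node:net";

// Split "host:port" on the last colon (bare IPv6 literals are not handled here).
function parseTarget(target: string): { host: string; port: number } {
  const idx = target.lastIndexOf(":");
  if (idx <= 0) throw new Error(`expected host:port, got "${target}"`);
  const port = Number(target.slice(idx + 1));
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`invalid port in "${target}"`);
  }
  return { host: target.slice(0, idx), port };
}

// Resolve to true if a TCP connection opens within timeoutMs, false otherwise.
function canConnect(target: string, timeoutMs = 3000): Promise<boolean> {
  const { host, port } = parseTarget(target);
  return new Promise((resolve) => {
    const socket = connect({ host, port });
    const finish = (ok: boolean) => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs, () => finish(false));
    socket.once("connect", () => finish(true));
    socket.once("error", () => finish(false));
  });
}

// Usage (example target):
// canConnect("tempo.railway.internal:4318").then((ok) => console.log("tempo reachable:", ok));
```

A `false` result points at the listener (wrong port, or not bound to IPv6) rather than at anything blocking traffic in between.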


addamsson
PROOP

2 years ago

oh, i'll try something then


addamsson
PROOP

2 years ago

FUCK YES



addamsson
PROOP

2 years ago

thanks for the help, everything seems to be working now!


2 years ago

what was the final fix?


canastro
HOBBY

a year ago

I'm facing similar issues, @addamsson can you share your tempo.yaml?

Right now I'm facing this issue:

level=error ts=2025-01-03T15:29:42.508327114Z caller=app.go:223 msg="module failed" module=distributor err="starting module distributor: invalid service state: Failed, expected: Running, failure: failed to start subservices: not healthy, 0 terminated, 1 failed: [error starting receiver: listen tcp [fd12:2222:f2cb:0:2000:8:eb5f:db2]:4317: bind: cannot assign requested address]"

changed tempo.yaml to:

distributor:
  ring:
    kvstore:
      store: memberlist
    instance_interface_names:
      - railnet0
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: "0.0.0.0:4317"
        http:
          endpoint: "0.0.0.0:4318"

And now the service starts and I'm able to connect from grafana...

But my app is failing to publish any traces... although it works locally (I'm also using effect.ts)


canastro
HOBBY

a year ago

finally got it working for anyone looking into this in the future:

tempo.yaml:

---
server:
  grpc_listen_address: '::0'
  grpc_listen_port: 9095
  http_listen_address: '::0'
  http_listen_port: 3200

query_frontend:
  search:
    duration_slo: 5s
    throughput_bytes_slo: 1.073741824e+09
  trace_by_id:
    duration_slo: 100ms

distributor:
  ring:
    kvstore:
      store: memberlist
    instance_interface_names:
      - railnet0
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: "0.0.0.0:4317"
        http:
          endpoint: "0.0.0.0:4318"

ingester:
  lifecycler: 
    ring:
      replication_factor: 1
      kvstore:
        store: memberlist
    address: '::'
    enable_inet6: true
  max_block_duration: 5m # cut the headblock when this much time passes. this is being set for demo purposes and should probably be left alone normally

memberlist:
  bind_addr:
    - '::'
  bind_port: 7946

compactor:
  ring:
    instance_interface_names:
      - railnet0
    kvstore:
      store: memberlist
    enable_inet6: true
  compaction:
    block_retention: 1h # overall Tempo trace retention. set for demo purposes

metrics_generator:
  ring:
    instance_interface_names:
      - railnet0
    enable_inet6: true
  registry:
    external_labels:
      source: tempo
      cluster: docker-compose
  storage:
    path: /var/tempo/generator/wal
    remote_write:
      - url: http://prometheus:9090/api/v1/write
        send_exemplars: true
  traces_storage:
    path: /var/tempo/generator/traces
  processor:
    local_blocks:
      filter_server_spans: false
      flush_to_storage: true

storage:
  trace:
    backend: local # backend configuration to use
    wal:
      path: /var/tempo/wal # where to store the WAL locally
    local:
      path: /var/tempo/blocks

overrides:
  defaults:
    metrics_generator:
      processors: [service-graphs, span-metrics, local-blocks] # enables metrics generator
      generate_native_histograms: both

canastro
HOBBY

a year ago

Make sure to add a volume to your tempo service, mount it at /var/tempo, and set RAILWAY_RUN_UID="0" on the tempo service


a year ago

!s


Status changed to Solved adam over 1 year ago

