2 years ago
I'm trying to debug my observability setup (whether prometheus is accessible from my deployed app). Is this possible?
128 Replies
what's weird is that i see an arrow from my app to postgres but not to my other deployments...is it possible that they are not on the same network?

2 years ago
> Is there a way to get a prompt on a deployed service
there is not, railway does not provide a way to ssh into a service.
> whether prometheus is accessible from my deployed app
it is as long as you are listening on ipv6 and using the correct port, as railway's private network is ipv6 only.
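Since there is no shell to test from, one option is to probe from inside the deployed app itself. The following is only a minimal sketch, assuming Node 18+ (for the global `fetch`); the `probe` function and its result shape are hypothetical names made up for illustration, not part of Railway or Prometheus.

```typescript
// Hypothetical reachability probe. Nothing here is a Railway or Prometheus API;
// it only issues a plain HTTP GET with a timeout and reports what happened.
type FetchLike = (
  url: string,
  init?: { signal?: AbortSignal },
) => Promise<{ ok: boolean; status: number }>;

interface ProbeResult {
  url: string;
  reachable: boolean;
  detail: string;
}

async function probe(
  url: string,
  timeoutMs = 3000,
  fetchImpl: FetchLike = (u, i) => fetch(u, i),
): Promise<ProbeResult> {
  const controller = new AbortController();
  // Abort the request if the target does not answer in time.
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetchImpl(url, { signal: controller.signal });
    return { url, reachable: res.ok, detail: `HTTP ${res.status}` };
  } catch (err) {
    // Connection refused, DNS failure, and timeouts all end up here.
    const detail = err instanceof Error ? err.message : String(err);
    return { url, reachable: false, detail };
  } finally {
    clearTimeout(timer);
  }
}
```

Logging the result of something like `probe("http://prometheus.railway.internal:9090/-/ready")` at startup would show in the deploy logs whether the service answers. The port is the one used in this thread and `/-/ready` is Prometheus's readiness endpoint; adjust both to your setup.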
> what's weird is that i see an arrow from my app to postgres but not to my other deployments
the arrows and their directions are dynamically detected when you use reference variables, you are likely hard coding variables when you shouldn't be -
> is it possible that they are not on the same network?
its not possible, every service within a given environment in a project is in the same private network.
> I tried this in local and it worked
local is likely ipv4, your services need to listen on an ipv6 host, or ideally they should all dual stack bind.
this is the first time i hear about this (and i've been using aws a lot)...what's an ipv6 host and how do i listen to one? what's a dual stack bind? 😅
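To make "listen on an ipv6 host" and "dual stack bind" concrete: `::` is the IPv6 wildcard address (the counterpart of `0.0.0.0`), and a dual-stack socket bound to it accepts both IPv6 and IPv4 connections. A minimal sketch for a Node app, where `bindHost` and `startServer` are hypothetical helper names and the `HOST` variable is an assumed convention, not something Railway defines:

```typescript
import { createServer, Server } from "node:http";

// Hypothetical helper: pick the bind host, defaulting to the IPv6 wildcard.
function bindHost(env: Record<string, string | undefined> = process.env): string {
  return env.HOST ?? "::";
}

// Hypothetical helper: start a tiny HTTP server on the chosen host.
function startServer(port: number, host: string = bindHost()): Server {
  const server = createServer((_req, res) => {
    res.writeHead(200, { "content-type": "text/plain" });
    res.end("ok\n");
  });
  // With host "::" and ipv6Only left at false, Node asks the OS for a
  // dual-stack socket, so IPv4 clients can usually connect as well.
  server.listen({ port, host, ipv6Only: false }, () => {
    console.log(`listening on [${host}]:${port}`);
  });
  return server;
}
```

Binding to `127.0.0.1` or `0.0.0.0` instead is the usual reason a service works locally but is unreachable over an IPv6-only private network.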
this worked literally everywhere before (local, aws, heroku) and it only fails on railway
it forces people to change their code so that they can accommodate the quirks of railway
i don't even understand why ipv6 is a concern when i'm using domain names eg: "http://prometheus:1234"
2 years ago
IPv6 is not a quirk at all, and it's not specific to Railway; you just happen to have always worked on IPv4 networks before.
There are docs for this -
2 years ago
because the domain names will resolve to an IPv6 address
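A quick way to see this from inside a service is to resolve the name and print the address family. This is only a sketch using Node's standard `dns` and `net` modules; `addressFamily` and `describeHost` are hypothetical helper names.

```typescript
import { lookup } from "node:dns/promises";
import { isIP } from "node:net";

// Classify a literal address; isIP returns 4, 6, or 0 (not an IP).
function addressFamily(address: string): "IPv4" | "IPv6" | "not an IP" {
  const version = isIP(address);
  return version === 4 ? "IPv4" : version === 6 ? "IPv6" : "not an IP";
}

// Resolve a hostname and describe every address it maps to.
async function describeHost(hostname: string): Promise<string[]> {
  const results = await lookup(hostname, { all: true });
  return results.map(
    (r) => `${hostname} -> ${r.address} (${addressFamily(r.address)})`,
  );
}
```

If `describeHost("prometheus.railway.internal")` only reports IPv6 addresses, a target that listens on IPv4 alone will refuse the connection even though the name resolves fine.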
2 years ago
yep, as mentioned everywhere else just happened to be IPv4
you implement something, cut corners by not implementing ipv4, and then you imply that all of this is totally OK, and your users are to blame for not thinking about the corners you cut 😅
i'm not saying that you're not right, but from a user perspective it is a slap in the face
2 years ago
not implementing IPv4 is not corner cutting, IPv6 was introduced over two decades ago
2 years ago
I don't like to compare but Fly.io's private network is also IPv6 only
2 years ago
Just an example to show that we are not the only ones who choose to use a newer standard
so if i understand this correctly the problem is tempo / prometheus is not configured to listen on ipv6? i'm using a metric exporter
2 years ago
If your app is attempting to connect to tempo / prometheus, they need to listen on IPv6, that is correct
2 years ago
you'd also need to be using the correct port when connecting to them, same way you use ports even developing locally
const NodeSdkLive = NodeSdk.layer(() => ({
  resource: { serviceName: "Larisel" },
  spanProcessor: new BatchSpanProcessor(
    new OTLPTraceExporter({
      url: "http://tempo:3100",
    }),
  ),
  instrumentations: [getNodeAutoInstrumentations()],
  // metricReader: new PrometheusExporter({ port: 9090 }),
  metricReader: new PeriodicExportingMetricReader({
    exportIntervalMillis: 500,
    exporter: new OTLPMetricExporter({
      url: "http://prometheus:9090",
    }),
  }),
}));
2 years ago
The vast majority of templates are user provided, it's possible they didn't fully test the template?
2 years ago
can you send me over some actual error messages?
2 years ago
is there perhaps a verbose debug mode you can turn on?
i'm trying to use the setup that I was using before (metrics through prom, traces through tempo, aggregated in grafana)
2 years ago
I'm sure future readers would love that
2 years ago
I'm sorry I wouldn't know the exact reasons why it was chosen, I was not around for that discussion
in the docs it says that i have to use x.railway.internal in order for this to work, but in the railway app it says "you can also call me at x". so would it make a difference if i used x.railway.internal:1234 instead of x:1234, or is it the same?
2 years ago
it shouldn't, I've never seen it make a difference
2 years ago
I have not
2 years ago
have you got your services to listen on ipv6?
2 years ago
awesome!
i copied the settings from this page: https://grafana.com/docs/tempo/latest/configuration/network/ipv6/
2 years ago
link me to the specific service please
2 years ago
railnet0, but it shouldnt matter since you shouldnt be hardcoding any interfaces
2 years ago
bad practice 😬
2 years ago
what env?
level=info ts=2024-09-09T19:40:18.412940643Z caller=main.go:121 msg="Starting Tempo" version="(version=r165-7421936, branch=r165, revision=7421936ba)"
level=info ts=2024-09-09T19:40:18.414789251Z caller=cache.go:55 msg="caches available to storage backend" parquet-footer=false bloom=false parquet-offset-idx=false parquet-column-idx=false trace-id-index=false parquet-page=false
level=info ts=2024-09-09T19:40:18.41779872Z caller=server.go:249 msg="server listening on addresses" http=[::]:3100 grpc=[::]:9095
level=info ts=2024-09-09T19:40:18.418135313Z caller=cache.go:55 msg="caches available to storage backend" parquet-footer=false bloom=false parquet-offset-idx=false parquet-column-idx=false trace-id-index=false parquet-page=false
level=warn ts=2024-09-09T19:40:18.42118818Z caller=netutil.go:90 msg="error getting addresses for interface" inf=eth0 err="route ip+net: no such network interface"
level=info ts=2024-09-09T19:40:18.421222206Z caller=memberlist_client.go:439 msg="Using memberlist cluster label and node name" cluster_label= node=e876d5c06828-2d8cd974
level=warn ts=2024-09-09T19:40:18.421335895Z caller=netutil.go:90 msg="error getting addresses for interface" inf=en0 err="route ip+net: no such network interface"
level=error ts=2024-09-09T19:40:18.421379281Z caller=main.go:124 msg="error running Tempo" err="failed to init module services: error initialising module: compactor: failed to create compactor: no useable address found for interfaces [eth0 en0]"
2 years ago
best to not hardcode an interface, but if you have to its railnet0 -
now that im looking at this, i think i should note that the ipv4 address is used for public traffic.
i don't hardcode stuff, but this comes from the tracing tool i've been using (Grafana Tempo)
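Without a shell, one way to confirm which interface actually carries the private IPv6 address is to log it from a Node service. A minimal sketch using `os.networkInterfaces()`; the `ipv6Interfaces` helper and its types are hypothetical, and `railnet0` is simply the name given in this thread.

```typescript
import { networkInterfaces } from "node:os";

// Only the fields this sketch needs from each interface entry.
interface IfaceAddr {
  address: string;
  family: string | number; // "IPv6" on current Node; 6 on a few older releases
  internal: boolean;
}
type InterfaceMap = Record<string, IfaceAddr[] | undefined>;

// Return the non-loopback IPv6 addresses, grouped by interface name.
function ipv6Interfaces(
  ifaces: InterfaceMap = networkInterfaces(),
): Record<string, string[]> {
  const out: Record<string, string[]> = {};
  for (const [name, infos] of Object.entries(ifaces)) {
    const addrs = (infos ?? [])
      .filter((i) => (i.family === "IPv6" || i.family === 6) && !i.internal)
      .map((i) => i.address);
    if (addrs.length > 0) out[name] = addrs;
  }
  return out;
}
```

Printing `ipv6Interfaces()` at startup gives the interface names to compare against whatever a tool like Tempo is configured to look for (it tried `eth0` and `en0` in the log above).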
2 years ago
i didnt mention a public api?
2 years ago
look at the link i sent for context on what i said
2 years ago
yes, please look at the link
2 years ago
thats not ideal
2 years ago
awesome!
2 years ago
publicly?
2 years ago
you dont need to do anything for that, there is no firewall or anything
2 years ago
what was the final fix?
I'm facing similar issues, @addamsson can you share your tempo.yaml?
Right now Im facing this issue:
level=error ts=2025-01-03T15:29:42.508327114Z caller=app.go:223 msg="module failed" module=distributor err="starting module distributor: invalid service state: Failed, expected: Running, failure: failed to start subservices: not healthy, 0 terminated, 1 failed: [error starting receiver: listen tcp [fd12:2222:f2cb:0:2000:8:eb5f:db2]:4317: bind: cannot assign requested address]"
changed tempo.yaml to:
distributor:
  ring:
    kvstore:
      store: memberlist
    instance_interface_names:
      - railnet0
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: "0.0.0.0:4317"
        http:
          endpoint: "0.0.0.0:4318"
And now the service starts and I'm able to connect from grafana...
But my app is failing to publish any traces... although it works locally (I'm also using effect.ts)
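One thing worth checking in a situation like this is that the exporter URL points at the OTLP receiver port from the config above (4318 for HTTP) and includes the signal path, rather than Tempo's own HTTP API port. A hedged sketch: `otlpHttpUrl` is a hypothetical helper, the `/v1/traces` style paths are the OTLP/HTTP defaults, and the host names are only examples.

```typescript
type Signal = "traces" | "metrics" | "logs";

interface OtlpTarget {
  host: string; // e.g. "tempo" or "tempo.railway.internal" (example names)
  port?: number; // defaults to 4318, the OTLP/HTTP port in the receiver config
}

// Hypothetical helper: build the full OTLP/HTTP endpoint for one signal.
function otlpHttpUrl(target: OtlpTarget, signal: Signal): string {
  const port = target.port ?? 4318;
  // A literal IPv6 address must be wrapped in brackets inside a URL.
  const host = target.host.includes(":") ? `[${target.host}]` : target.host;
  return `http://${host}:${port}/v1/${signal}`;
}
```

For example, `otlpHttpUrl({ host: "tempo" }, "traces")` yields `http://tempo:4318/v1/traces`, which could be passed as the `url` of an OTLP/HTTP trace exporter. Whether your exporter appends the path itself depends on the library and on how the endpoint is supplied, so treat this as an assumption to verify.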
finally got it working for anyone looking into this in the future:
tempo.yaml:
---
server:
  grpc_listen_address: '::0'
  grpc_listen_port: 9095
  http_listen_address: '::0'
  http_listen_port: 3200
query_frontend:
  search:
    duration_slo: 5s
    throughput_bytes_slo: 1.073741824e+09
  trace_by_id:
    duration_slo: 100ms
distributor:
  ring:
    kvstore:
      store: memberlist
    instance_interface_names:
      - railnet0
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: "0.0.0.0:4317"
        http:
          endpoint: "0.0.0.0:4318"
ingester:
  lifecycler:
    ring:
      replication_factor: 1
      kvstore:
        store: memberlist
    address: '::'
    enable_inet6: true
  max_block_duration: 5m # cut the headblock when this much time passes. this is being set for demo purposes and should probably be left alone normally
memberlist:
  bind_addr:
    - '::'
  bind_port: 7946
compactor:
  ring:
    instance_interface_names:
      - railnet0
    kvstore:
      store: memberlist
    enable_inet6: true
  compaction:
    block_retention: 1h # overall Tempo trace retention. set for demo purposes
metrics_generator:
  ring:
    instance_interface_names:
      - railnet0
    enable_inet6: true
  registry:
    external_labels:
      source: tempo
      cluster: docker-compose
  storage:
    path: /var/tempo/generator/wal
    remote_write:
      - url: http://prometheus:9090/api/v1/write
        send_exemplars: true
  traces_storage:
    path: /var/tempo/generator/traces
  processor:
    local_blocks:
      filter_server_spans: false
      flush_to_storage: true
storage:
  trace:
    backend: local # backend configuration to use
    wal:
      path: /var/tempo/wal # where to store the wal locally
    local:
      path: /var/tempo/blocks
overrides:
  defaults:
    metrics_generator:
      processors: [service-graphs, span-metrics, local-blocks] # enables metrics generator
      generate_native_histograms: both
Make sure to add a volume to your tempo service, mount it at /var/tempo, and set RAILWAY_RUN_UID="0" on the tempo service
a year ago
!s
Status changed to Solved adam • over 1 year ago

