GLIBC_2.38 not found error when deploying FastAPI with crawl4ai (using Playwright)

van-thiepHOBBY

2 months ago

Hi there, I'm using crawl4ai (which uses Playwright) to crawl and extract data, and FastAPI to expose it as an API. My code runs successfully locally, but when I deploy and run it on Railway I get this error: /opt/venv/lib/python3.12/site-packages/playwright/driver/node: /lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.38' not found (required by /nix/store/s94fwp43xhzkvw8l8nqslskib99yifzi-gcc-13.3.0-lib/lib/libstdc++.so.6)
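
The message says that the Nix-provided libstdc++ was built against glibc 2.38, while the libc found in the container is older. A minimal sketch for checking which glibc a running container actually provides, using only the Python standard library; the helper names `parse_version` and `glibc_at_least` are illustrative and not part of Railway, Playwright, or crawl4ai:

```python
import platform


def parse_version(text):
    """Turn a dotted version such as '2.38' into (2, 38) for numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def glibc_at_least(required, found=None):
    """Return True if the glibc version `found` is at least `required`.

    When `found` is None, the version is read from platform.libc_ver(),
    which returns an empty version string if no libc version is detected.
    """
    if found is None:
        _, found = platform.libc_ver()
    if not found:
        return False
    return parse_version(found) >= parse_version(required)
```

Running `print(platform.libc_ver(), glibc_at_least("2.38"))` inside the deployed container would show whether the base image is the limiting factor.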

I googled it and saw a recommendation to use the Browserless template offered by Railway, and found this repo: https://github.com/brody192/playwright-example-python. In this repo, I would need to change browser = await p.chromium.launch() to browser = await p.chromium.connect(os.environ['BROWSER_PLAYWRIGHT_ENDPOINT']).
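
For reference, the launch-to-connect change described above could be sketched as follows in plain Playwright (not crawl4ai). The variable name `BROWSER_PLAYWRIGHT_ENDPOINT` is assumed to be what the Browserless template exposes, and `open_browser` and `main` are hypothetical helper names:

```python
import asyncio
import os


async def open_browser(p, env=None):
    """Connect to a remote browser if an endpoint is configured, else launch locally."""
    env = os.environ if env is None else env
    # Assumed variable name; adjust to whatever the Browserless service provides.
    endpoint = env.get("BROWSER_PLAYWRIGHT_ENDPOINT")
    if endpoint:
        return await p.chromium.connect(endpoint)
    return await p.chromium.launch()


async def main():
    # Imported here so the helper above can be read without Playwright installed.
    from playwright.async_api import async_playwright

    async with async_playwright() as p:
        browser = await open_browser(p)
        page = await browser.new_page()
        await page.goto("https://example.com")
        print(await page.title())
        await browser.close()


# To try it out: asyncio.run(main())
```

Falling back to a local launch keeps the same code usable on a development machine where no remote endpoint is set.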

But I'm using crawl4ai, which uses Playwright under the hood, and I don't see any parameter in the crawl4ai doc to which I can apply the above code.

Is there any way to fix this issue? A workaround is OK, because I'm new to this and need to deploy to production quickly for our users. Thanks!

Project id: 86b6d50c-4a4d-423a-81d5-4e7a0d38cacd

Solved


van-thiepHOBBY

2 months ago

Hello, any help on this issue?


2 months ago

hello! can you try to switch to railpack in your railway.json file


van-thiepHOBBY

2 months ago

Hi, I changed it to Railpack and added a custom install command, but while Nixpacks ran it (playwright install), Railpack didn't. Here is my railway.json file:

{
  "$schema": "https://railway.app/railway.schema.json",
  "build": {
    "builder": "RAILPACK",
    "nixpacksPlan": {
      "phases": {
        "setup": {
          "nixPkgs": [
            "python3",
            "gcc"
          ]
        },
        "install": {
          "dependsOn": [
            "setup"
          ],
          "cmds": [
            "pip install -r requirements.txt",
            "python -m playwright install --with-deps chromium"
          ]
        }
      }
    }
  },
  "deploy": {
    "startCommand": "hypercorn main:app --bind \"[::]:$PORT\""
  }
}


2 months ago

i hope you can see the issue there, you set the builder to railpack, and then tried to use nixpacks config
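
A mismatch like this is easy to miss by eye. As a rough sketch, a small script could flag it before deploying; the function names `find_builder_mismatch` and `check_file` are hypothetical, and the check only covers the one case seen in this thread (builder set to RAILPACK while `build.nixpacksPlan` is present):

```python
import json


def find_builder_mismatch(config):
    """Return a warning string if Nixpacks-only config is combined with the Railpack builder."""
    build = config.get("build", {})
    builder = str(build.get("builder", "")).upper()
    if builder == "RAILPACK" and "nixpacksPlan" in build:
        return "builder is RAILPACK but build.nixpacksPlan is set"
    return None


def check_file(path="railway.json"):
    """Load a railway.json file and run the mismatch check on it."""
    with open(path, encoding="utf-8") as fh:
        return find_builder_mismatch(json.load(fh))
```

Calling `check_file()` from the project root returns `None` when no mismatch is found.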


van-thiepHOBBY

2 months ago

Sorry, my bad. But I don't see any corresponding key for Railpack in the schema (such as railpackPlan).

I've also tried adding the variable RAILPACK_INSTALL_CMD=pip install -r requirements.txt && python -m playwright install --with-deps chromium and changing railway.json to

{
  "$schema": "https://railway.app/railway.schema.json",
  "build": {
    "builder": "RAILPACK"
  },
  "deploy": {
    "startCommand": "hypercorn main:app --bind \"[::]:$PORT\""
  }
}

But it didn't run the custom install command.


2 months ago

try adding this to a railpack.json file -

{
  "$schema": "https://schema.railpack.com",
  "steps": {
    "playwright": {
      "inputs": [{ "step": "install" }],
      "commands": ["python -m playwright install --with-deps chromium"]
    }
  },
  "deploy": {
    "inputs": ["...", { "step": "playwright" }]
  }
}

van-thiepHOBBY

2 months ago

I got this error:

`✕ python -m playwright install --with-deps chromium
process "sh -c python -m playwright install --with-deps chromium" did not complete successfully: exit code: 1

✕ docker-image://ghcr.io/railwayapp/railpack-runtime:latest
failed to do request: Head "https://ghcr.io/v2/railwayapp/railpack-runtime/manifests/latest": context canceled: context canceled`

Also, do I have to keep both railway.json and railpack.json? (In the settings, I've set the config path to railway.json.)

I'm not a technical person, so sorry if I'm asking a stupid question.


van-thiepHOBBY

2 months ago

When I try to access the link: https://ghcr.io/v2/railwayapp/railpack-runtime/manifests/latest, I got error: {"errors":[{"code":"UNAUTHORIZED","message":"authentication required"}]}


van-thiepHOBBY

2 months ago

I checked the deployment log and saw it runs "playwright install" successfully. But when I ran my application, I saw this error in the logs:

[attached image: error from the application logs]


van-thiepHOBBY

2 months ago

Although, in the build logs, I saw that Playwright was downloaded to this path

[attached image: build logs showing the Playwright download path]


2 months ago

@jr - when you have a moment, do you have an idea on how you would include what playwright has installed in the deploy image?


2 months ago

You need to specify the directory to include (the warning is being swallowed and not included in the build logs). This should work:

{
  "$schema": "https://schema.railpack.com",
  "provider": "python",
  "steps": {
    "playwright": {
      "inputs": [{ "step": "install" }],
      "commands": ["python -m playwright install --with-deps chromium"]
    }
  },
  "deploy": {
    "inputs": [
      "...",
      {
        "step": "playwright",
        "include": ["/root/.cache"]
      }
    ],
    "startCommand": "hypercorn main:app --bind \"[::]:$PORT\""
  }
}
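
The `/root/.cache` include presumably targets Playwright's default browser download location on Linux, `~/.cache/ms-playwright`, which can be overridden with `PLAYWRIGHT_BROWSERS_PATH`. A minimal sketch for confirming at runtime that the browser files made it into the deploy image; `playwright_cache_dir` and `installed_browsers` are illustrative helper names:

```python
import os
from pathlib import Path


def playwright_cache_dir(env=None, home=None):
    """Return the directory where Playwright is expected to keep browsers on Linux.

    The special value PLAYWRIGHT_BROWSERS_PATH=0 is not handled in this sketch.
    """
    env = os.environ if env is None else env
    override = env.get("PLAYWRIGHT_BROWSERS_PATH")
    if override:
        return Path(override)
    home = Path.home() if home is None else Path(home)
    return home / ".cache" / "ms-playwright"


def installed_browsers(cache_dir):
    """List the browser folders found in the cache, or [] if the cache is missing."""
    cache_dir = Path(cache_dir)
    if not cache_dir.is_dir():
        return []
    return sorted(entry.name for entry in cache_dir.iterdir() if entry.is_dir())
```

Logging `installed_browsers(playwright_cache_dir())` at application startup gives a quick signal: an empty list means the browser directory was not copied into the final image.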

2 months ago

i was close


2 months ago

I'm iterating on this lately so this is good feedback


2 months ago

i had thought just specifying the step would also include its files


2 months ago

that would include all of the OS system files and I'm trying to avoid merging those because it is slow and can blow up the image


2 months ago

i see


2 months ago

but I can see how it is confusing. I'll experiment with some things


van-thiepHOBBY

2 months ago

Hi, I tried your config file, and then the build and deployment were successful. But I got this error when running the app

[attached image: error when running the app]


van-thiepHOBBY

2 months ago

and here are the build logs

[attached images: build logs]


2 months ago

hmm it is because the playwright deps are not being installed in the runtime image. Can you try changing to

{
  "$schema": "https://schema.railpack.com",
  "provider": "python",
  "steps": {
    "playwright": {
      "inputs": [{ "step": "build" }],
      "commands": ["python -m playwright install --with-deps chromium"]
    }
  },
  "deploy": {
    "inputs": [{ "step": "playwright" }],
    "startCommand": "hypercorn main:app --bind \"[::]:$PORT\""
  }
}

It's not ideal, but it is something we can address soon in Railpack
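
When system dependencies are missing from the runtime image, the browser binary typically fails to start because shared libraries cannot be resolved. One way to confirm this is to run `ldd` on the browser binary inside the container and look for "not found" entries. A rough sketch, where `missing_libraries` and `check_binary` are hypothetical helpers and the binary path is left for the caller to supply:

```python
import subprocess


def missing_libraries(ldd_output):
    """Extract the names of unresolved shared libraries from `ldd` output."""
    missing = []
    for line in ldd_output.splitlines():
        if "=> not found" in line:
            missing.append(line.split("=>")[0].strip())
    return missing


def check_binary(path):
    """Run `ldd` on a binary and return the libraries it cannot resolve."""
    result = subprocess.run(["ldd", path], capture_output=True, text=True, check=False)
    return missing_libraries(result.stdout)
```

An empty list from `check_binary` suggests the OS packages installed by `--with-deps` are present in the image that actually runs the app.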


van-thiepHOBBY

2 months ago

It works now. Thanks a lot for your support!




Status changed to Solved brody 2 months ago