Did anyone notice or experience any issues or limitations with the REST API?
I have more than 3000 entries in one collection.
When I fetch all of them via the API, I receive a Gateway Timeout after 60 seconds. So my workaround was to paginate the requests using the skip and limit query parameters.
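For context, the skip/limit loop is roughly something like this (a simplified sketch, not my exact code; the URL, token, and collection name are placeholders, and it assumes Cockpit 0.x's /api/collections/get endpoint returning an "entries" array):

<?php
// Rough sketch of paginated fetching via limit/skip (placeholder URL and token).
$base  = 'https://example.com/api/collections/get/mycollection';
$token = 'xxtokenxx';
$limit = 100;
$skip  = 0;
$all   = [];

do {
    $url      = $base . '?token=' . $token . '&limit=' . $limit . '&skip=' . $skip;
    $response = json_decode(file_get_contents($url), true);

    // The response is assumed to wrap the results in an "entries" array.
    $entries = $response['entries'] ?? [];
    $all     = array_merge($all, $entries);
    $skip   += $limit;
} while (count($entries) === $limit); // a short page means we reached the end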
However, there is still a clear difference: requests covering entries 1 to 3000 finish in about 1 second (limit 100 or 500), while a request for records after position 3000 is very slow (around 50 seconds or more for a limit of 50 or 100 entries).
As long as I stay below the first 3000 entries, everything works fine.
Can someone explain what is causing this? Do you have any hints on how to solve it?
Increasing the timeout is not a solution; I would like to fix the real cause.
Thank you for your reply.
Server resources should be way more than enough (4 GB RAM, 2 vCPU).
I tested the whole process with limits of 25, 50, 100, 500, and 1000 (entries per API request). It is especially obvious with a limit of 25: every request with an offset above 3000 entries is very slow. Usually a request takes 1-2 seconds, but after offset 3000 it takes 8, 16, or even 24 seconds for only 25 entries.
The collection entries are all similar; there is nothing special about the entries that are fetched very slowly.
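The timing comparison can be scripted roughly like this (a simplified sketch with the same placeholder URL/token and endpoint assumptions as in my first post):

<?php
// Rough timing sketch: measure the response time at increasing offsets.
$base  = 'https://example.com/api/collections/get/mycollection';
$token = 'xxtokenxx';
$limit = 25;

for ($skip = 0; $skip <= 3500; $skip += 500) {
    $url = $base . '?token=' . $token . '&limit=' . $limit . '&skip=' . $skip;

    $start = microtime(true);
    file_get_contents($url); // only the elapsed time matters here
    printf("skip %4d: %.2f s\n", $skip, microtime(true) - $start);
}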
Since you're using SQLite, which is accessed via file-based reads, I suspect that I/O operations are slow. Do you use, e.g., mounted network storage? Simple data models shouldn't take more than 30-100 ms to respond (depending on the amount of data).
I went through all of the most recently created entries (everything above position 3000) and there is nothing special about them. All entries have all required values and nothing additional.
The collection contains 16 fields:
1 moderation field
7 text fields
1 boolean field
1 wysiwyg field
1 cloudinary field
3 collection link fields - simple select
1 layout field for layout components
1 set field (2x text, textarea, boolean)
No loops in collection links.
Without pagination it ends in a Gateway Timeout, and the same happens with a limit of 500 per API request.
But still only for the request that handles the entries after position 3000.
Details
Cockpit version: 0.12.1
PHP 7.4.33
Apache2
running on Docker
Addons
Cloudinary - for all assets (images)
Editor formats
FormValidation
Layout components
Moderation
Custom component for an additional field type and UI adjustments
I never used the Cloudinary or the Moderation addon, so I’m just guessing…
Like @artur said, it could be a slow HDD (on the cloud storage).
If I remember correctly, the Moderation addon adds more database requests to compare entries with revisions. So this could also be a reason for slowing it down.
One other thing: how big are your sqlite files? Maybe the data is very fragmented, which can be solved by vacuuming the databases.
To vacuum your sqlite files, you can copy this cleanup script into /path/to/cockpit/config/cli/db/cleanup.php and call it via the CLI:
cd /path/to/cockpit
php ./cp db/cleanup
The vacuum command is enabled by default. All other options need a command-line argument, e.g. php ./cp db/cleanup --trash
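For illustration only (this is not the cleanup script itself): the vacuum step boils down to something like the following, assuming PHP's PDO sqlite driver and the default storage/data layout; the path is a placeholder.

<?php
// Minimal sketch: vacuum every sqlite file in Cockpit's data directory.
// The path is a placeholder; adjust it to your installation.
$dataDir = '/path/to/cockpit/storage/data';

foreach (glob($dataDir . '/*.sqlite') as $file) {
    $sizeBefore = filesize($file);

    $db = new PDO('sqlite:' . $file);
    $db->exec('VACUUM'); // rebuilds the file and reclaims fragmented/free pages
    $db = null;          // close the connection before re-checking the size

    clearstatcache(true, $file);
    printf("%s: %d KB -> %d KB\n", basename($file), $sizeBefore / 1024, filesize($file) / 1024);
}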
Cloud storage is SSD, so it should not be a bottleneck.
Thank you for providing the cleanup script. I will definitely try it. Is there a good reason not to empty the trash? E.g., to keep the possibility of a rollback (I did not find anything about that in the script code).
I wrote the cleanup script to shrink a small db from megabytes to a few kilobytes, to keep it in git version control. Also, I didn't know about the trash feature for a long time. So, to keep the file very small, it's good to delete all unneeded data.
It's also a privacy thing. If I delete data, I assume the data is gone, not just unpublished. The best data protection is to not store (unneeded) data.
E.g., to keep the possibility of a rollback
The cleanup script has no rollback. Create a manual backup first. And run the script without the --trash argument to keep the trashed entries.
Do you think the size of the Cockpit sqlite file is too big?
No, but running the vacuum command doesn't hurt. It was just a guess, because populating collection links and comparing entries against revisions could add up if the db is huge and fragmented. I have always had much smaller databases, so this is academic for me. From what I have read about this topic, you should only be concerned when your sqlite files hit multiple (or hundreds of) gigabytes.
I am also thinking of adding the RedisCache addon.
I haven't tested it, but using Redis to cache repeated db calls is always a good idea.
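Conceptually, the caching layer is just something like this (a rough sketch, not the RedisCache addon's actual API; it assumes the phpredis extension, and the loader callback stands in for the real, slow Cockpit query):

<?php
// Illustrative cache-aside sketch (phpredis assumed; key naming and TTL are arbitrary).
function getEntriesCached(Redis $redis, callable $loadEntries, string $collection, int $skip, int $limit): array
{
    $key = "collection:{$collection}:{$skip}:{$limit}";

    // Serve repeated requests from the cache.
    $cached = $redis->get($key);
    if ($cached !== false) {
        return json_decode($cached, true);
    }

    // Cache miss: run the slow SQLite-backed query once ...
    $entries = $loadEntries($collection, $skip, $limit);

    // ... and keep the result for 5 minutes.
    $redis->setex($key, 300, json_encode($entries));

    return $entries;
}

// Usage: $entries = getEntriesCached($redis, $myLoader, 'mycollection', 0, 100);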
Thank you for your great and comprehensive responses.
I did the database clean-up and also connected Redis to Cockpit CMS. It seems that caching improved things a lot, but with an empty cache it can still be a problem.
For now, it solved my problems with Gatsby builds.
But I am still really curious why it is happening and what the real cause is.
If I find out, I will update this thread.