oldestxmin Is Stuck Due to Long Transactions
Symptom
VACUUM of Astore can only clean up dead tuples that have been committed and whose transaction IDs are smaller than oldestxmin. If a long transaction is not committed for a long time, oldestxmin cannot be updated. As a result, AUTOVACUUM cannot reclaim dead tuples, causing Astore space bloat.
Fault Locating
To solve this problem, use the gs_txid_oldestxmin() view to query oldestxmin and use the txid_current() view to query the latest XID. Then, you can determine whether oldestxmin is stuck based on the difference between the two views.
select txid_current() as xid, gs_txid_oldestxmin() as oldestxmin, xid - oldestxmin as diff;
Example:
gaussdb=# select txid_current() as xid, gs_txid_oldestxmin() as oldestxmin, xid - oldestxmin as diff;
xid | oldestxmin | diff
--------+------------+------
220790 | 220789 | 1
(1 row)
Generally, a difference of less than 10,000 between the two is considered normal. If the difference is too large, for example, several hundred thousand or even millions, long transactions exist in the system. You can use pg_stat_activity to obtain information about blocked transactions.
select 'elp:'|| now() - xact_start diff, query, datname, pid, usename, xact_start, enqueue, state, query_id, query from pg_stat_activity where query not like 'WLM%' order by diff desc; select * from pg_running_xacts;
Example:
gaussdb=# select 'elp:'|| now() - xact_start diff, query, datname, pid, usename, xact_start, enqueue, state, query_id, query from pg_stat_activity where query not like 'WLM%' and state='active' order by diff desc;
diff | query
| datname | pid | usename | xact_start | enqueue | state | query_
id | query
--------------+---------------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------+----------+-----------------+-----------+-------------------------------+---------+--------+-----------
-------+----------------------------------------------------------------------------------------------------------------------------------------------------------
---------------------------------------------------
elp:00:00:00 | select 'elp:'|| now() - xact_start diff, query, datname, pid, usename, xact_start, enqueue, state, query_id, query from pg_stat_activity where que
ry not like 'WLM%' and state='active' order by diff desc; | postgres | 140123736897280 | postgres | 2025-02-09 13:50:21.628494-05 | | active | 3096224744
018863 | select 'elp:'|| now() - xact_start diff, query, datname, pid, usename, xact_start, enqueue, state, query_id, query from pg_stat_activity where query not
like 'WLM%' and state='active' order by diff desc;
0 |
elp: |
| postgres | 140124852582144 | postgres | | | active |
0 |
elp: |
| postgres | 140124747724544 | postgres | | | active |
0 |
elp: |
| postgres | 140124232349440 | postgres | | | active |
0 |
(6 rows)
gaussdb=# select * from pg_running_xacts where pid=140125682005760;
handle | gxid | state | node | xmin | vacuum | timeline | prepare_xid | pid | next_xid | dbid
--------+--------+-------+--------+------+--------+----------+-------------+-----------------+----------+-------
-1 | 220796 | 0 | sgnode | 0 | f | 11 | 0 | 140125682005760 | 0 | 12912
(1 row)
In addition, logical replication slots may also block the update of oldestxmin. You can check the replication slot information as follows:
select * from pg_get_replication_slots();
Example:
gaussdb=# select * from pg_get_replication_slots();
slot_name | plugin | slot_type | datoid | active | xmin | catalog_xmin | restart_lsn | dummy_standby | confirmed_flush | confirmed_csn | dictionary_csn_min | slo
t_dictionary_type
-----------+--------+-----------+--------+--------+------+--------------+-------------+---------------+-----------------+---------------+--------------------+----
------------------
dn_6003 | | physical | 0 | t | | | 0/15E52890 | f | | | |
dn_6002 | | physical | 0 | t | | | 0/15E52890 | f | | | |
(2 rows)
Workarounds
After identifying the active thread that blocks the update of oldestxmin, terminate the corresponding thread.
Feedback
Was this page helpful?
Provide feedbackThank you very much for your feedback. We will continue working to improve the documentation.See the reply and handling status in My Cloud VOC.
For any further questions, feel free to contact us through the chatbot.
Chatbot