Z Archiver

Sacripant Ladson

Jan 21, 2024, 4:13:25 PM
to highrabzaro

pt-archiver is extensible via a plugin mechanism. You can inject your own code to add advanced archiving logic that could be useful for archiving dependent data, applying complex business rules, or building a data warehouse during the archiving process.

pt-archiver does not check for errors when it commits transactions. Commits on PXC can fail, but the tool does not yet check for or retry the transaction when this happens. If it happens, the tool will die.

z archiver


DOWNLOAD https://t.co/EvOmSqeDjd



If you specify --progress, the output is a header row, plus status output at intervals. Each row in the status output lists the current date and time, how many seconds pt-archiver has been running, and how many rows it has archived.

If you do want to use the ascending index optimization (see --no-ascend), but do not want to incur the overhead of ascending a large multi-column index, you can use this option to tell pt-archiver to ascend only the leftmost column of the index. This can provide a significant performance boost over not ascending the index at all, while avoiding the cost of ascending the whole index.
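A sketch of such an invocation, assuming the option described here is --ascend-first; the host, database, table, and index names are hypothetical:

```shell
# Hypothetical purge run that ascends only the leftmost column of the
# index named in the i= DSN part, instead of the full multi-column index.
pt-archiver \
  --source h=db1.example.com,D=app,t=events,i=idx_created_user \
  --where "created_at < NOW() - INTERVAL 90 DAY" \
  --purge --limit 1000 \
  --ascend-first
```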

Enabled by default; causes pt-archiver to check that the source and destination tables have the same columns. It does not check column order, data type, etc. It just checks that all columns in the source exist in the destination and vice versa. If there are any differences, pt-archiver will exit with an error.

Specify a comma-separated list of columns to fetch, write to the file, and insert into the destination table. If specified, pt-archiver ignores other columns unless it needs to add them to the SELECT statement for ascending an index or deleting rows. It fetches and uses these extra columns internally, but does not write them to the file or to the destination table. It does pass them to plugins.

This option is useful as a shortcut to make --limit and --txn-size the same value, but more importantly it avoids transactions being held open while searching for more rows. For example, imagine you are archiving old rows from the beginning of a very large table, with --limit 1000 and --txn-size 1000. After some period of finding and archiving 1000 rows at a time, pt-archiver finds the last 999 rows and archives them, then executes the next SELECT to find more rows. This scans the rest of the table, but never finds any more rows. It has held open a transaction for a very long time, only to determine it is finished anyway. You can use --commit-each to avoid this.
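A minimal sketch of the scenario above, with hypothetical connection details:

```shell
# Hypothetical run: each 1000-row batch is committed as soon as it has
# been archived, so no transaction stays open during the final SELECT
# that scans the rest of the table and finds nothing.
pt-archiver \
  --source h=db1.example.com,D=app,t=log \
  --dest h=db2.example.com,D=archive,t=log \
  --where "1=1" \
  --limit 1000 --commit-each
```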

The default ascending-index optimization causes pt-archiver to optimize repeated SELECT queries so they seek into the index where the previous query ended, then scan along it, rather than scanning from the beginning of the table every time. This is enabled by default because it is generally a good strategy for repeated accesses.

Adds an extra WHERE clause to prevent pt-archiver from removing the newest row when ascending a single-column AUTO_INCREMENT key. This guards against re-using AUTO_INCREMENT values if the server restarts, and is enabled by default.

The extra WHERE clause contains the maximum value of the auto-increment column as of the beginning of the archive or purge job. If new rows are inserted while pt-archiver is running, it will not see them.

The presence of the file specified by --sentinel will cause pt-archiver to stop archiving and exit. The default is /tmp/pt-archiver-sentinel. You might find this handy to stop cron jobs gracefully if necessary. See also --stop.

WARNING: Using a default options file (F) DSN option that defines a socket for --source causes pt-archiver to connect to --dest using that socket unless another socket for --dest is specified. This means that pt-archiver may incorrectly connect to --source when it is meant to connect to --dest. For example:
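The situation the warning describes might look like this (file and host names are hypothetical):

```shell
# host1.cnf contains a [client] section with a socket= line. The F= part
# applies it to --source; because --dest names no socket of its own, the
# destination connection may reuse that socket and land on the source.
pt-archiver --source F=host1.cnf,D=db,t=tbl --dest h=host2
# Workaround: give --dest an explicit socket (S=) or force TCP by
# specifying a host and port in the --dest DSN.
```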

Causes pt-archiver to create the sentinel file specified by --sentinel and exit. This should have the effect of stopping all running instances which are watching the same sentinel file. See also --unstop.
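Because the sentinel is just a file, the same effect can be had from the shell; a minimal sketch using the documented default path:

```shell
SENTINEL=/tmp/pt-archiver-sentinel   # default --sentinel path
touch "$SENTINEL"                    # same effect as: pt-archiver --stop
test -f "$SENTINEL" && echo "sentinel present; watchers will exit"
rm -f "$SENTINEL"                    # clear it again before the next run
```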

Specifies the size, in number of rows, of each transaction. Zero disables transactions altogether. After pt-archiver processes this many rows, it commits both the --source and the --dest if given, and flushes the file given by --file.

This parameter is critical to performance. If you are archiving from a live server, which for example is doing heavy OLTP work, you need to choose a good balance between transaction size and commit overhead. Larger transactions create the possibility of more lock contention and deadlocks, but smaller transactions cause more frequent commit overhead, which can be significant. To give an idea, on a small test set I worked with while writing pt-archiver, a value of 500 caused archiving to take about 2 seconds per 1000 rows on an otherwise quiet MySQL instance on my desktop machine, archiving to disk and to another table. Disabling transactions with a value of zero, which turns on autocommit, dropped performance to 38 seconds per thousand rows.
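A run matching the 500-row figure quoted above might be invoked like this (connection details and file path are hypothetical):

```shell
# Hypothetical: SELECT up to 500 rows at a time and commit --source,
# --dest, and the --file flush after every 500 processed rows.
pt-archiver \
  --source h=db1.example.com,D=app,t=orders \
  --dest h=db2.example.com,D=archive,t=orders \
  --file '/tmp/archive-%Y-%m-%d.txt' \
  --where "created_at < '2023-01-01'" \
  --limit 500 --txn-size 500
```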

Causes pt-archiver to print a message if it exits for any reason other than running out of rows to archive. This can be useful if you have a cron job with --run-time specified, for example, and you want to be sure pt-archiver is finishing before running out of time.

This method is called just before pt-archiver begins iterating through rows and archiving them, but after it does all other setup work (examining table structures, designing SQL queries, and so on). This is the only time pt-archiver tells the plugin the column names for the rows it will pass to the plugin while archiving.

The cols argument is the column names the user requested to be archived, either by default or by the --columns option. The allcols argument is the list of column names for every row pt-archiver will fetch from the source table. It may fetch more columns than the user requested, because it needs some columns for its own use. When subsequent plugin functions receive a row, it is the full row containing all the extra columns, if any, added to the end.

This method is called after pt-archiver exits the archiving loop, commits all database handles, closes --file, and prints the final statistics, but before pt-archiver runs ANALYZE or OPTIMIZE (see --analyze and --optimize).

In addition to the Data Archiver module, the Archive Runs module is displayed as a subpanel under the record view of any data archiver record. The Archive Runs subpanel displays a history of runs that have occurred for the parent data archiver record, giving the administrator a clear history of what has occurred in the system and access to the affected record IDs.

Data Archiver jobs will run automatically on regularly set intervals when the Run Active Data Archives/Deletions scheduler is active. Whether the scheduler is active or not, an administrator may also run Data Archiver jobs manually as needed by clicking the "Perform Now" button on a data archiver record.

After saving an active Data Archiver record, the archive or deletion will automatically process the next time the Run Active Data Archives/Deletions scheduler runs. Alternatively, you may click on "Perform Now" in the data archiver job's record view to run the job immediately without activating or waiting for the scheduler.

When you hard-delete data via the Data Archiver, Sugar will preserve the IDs (and only the IDs) of the deleted records in a database table called archive_runs. All other data related to the hard-deleted records will be gone and not recoverable by any means other than a local backup. Therefore, we recommend backing up your database before performing hard-delete actions. Customers with access to their database can retrieve the list of IDs that were hard deleted from the row of the archive_runs table that is associated with the job that ran from the parent data_archivers record. SugarCloud customers can make and download a database backup to access the archive_runs table, or create a report in the Advanced Reports module if they are using Sugar Sell or Serve. Once you have the deleted IDs, you may be able to restore hard-deleted records by comparing the IDs with your backup.

At this step of the wizard, you can optionally enable usage of the Azure archiver appliance when Veeam Backup for Microsoft 365 transfers backed-up data between different instances of Azure Blob Storage or to Azure Blob Storage Archive. If you use the Azure archiver appliance, it usually speeds up the backup copy process and helps you reduce costs incurred by your cloud storage provider.

The Azure archiver appliance is a small auxiliary machine in Microsoft Azure that is deployed and configured automatically by Veeam Backup for Microsoft 365. Veeam services that Veeam Backup for Microsoft 365 installs on the Azure archiver appliance compress data passed through. This helps reduce network traffic and increase the speed of backup copy.

The process of the Azure archiver appliance deployment takes a couple of minutes. If you enable usage of the Azure archiver appliance, Veeam Backup for Microsoft 365 will create the archiver appliance at the beginning of a backup copy job and remove or reuse it after a backup copy job completes. By default, Veeam Backup for Microsoft 365 always keeps one archiver appliance for reuse.

http.FileServer will try to sniff the Content-Type by default if it can't be inferred from the file name. To do this, the http package will try to read from the file and then Seek back to the start of the file, which the library can't currently achieve. The same goes for Range requests. Seeking in archives is not currently supported by archiver due to limitations in dependencies.
