Uploaded image for project: 'Solution Center'
  1. Solution Center
  2. SOL-226

Details on DELETE operation

    XMLWordPrintable

    Details

    • Type: Explanation
    • Status: Published
    • Affects Version/s: None
    • Fix Version/s: None
    • Component/s: None
    • Labels:
      None
    • Explanation:
      Hide

      Delete

      The delete operation ensures the removal of data from tables. It works by simply marking the affected rows as deleted, without actually removing them physically from the table. This means that the data is still there, but it is simply ignored by the subsequent queries.

      Performance: Since no data is ever moved around, delete is normally very fast.

      How to determine the percentage of deleted rows

      To see what percentage of data is marked as deleted in a table, please see the DELETE_PERCENTAGE column in EXA_*_TABLES.

      Reorganize

      While delete is very fast, it does present a drawback: data that is no longer needed is still stored leading to some extra memory usage. To limit this drawback, whenever too many rows are deleted, the table is "reorganized" by physically replacing the deleted rows with non-deleted ones. After the reorganize is completed, the table contains no rows that are marked as deleted. Moreover, all the indices of the affected table are dropped and created again.

      Performance: Normally, the reorganize operation is an expensive operation. This is because it typically needs to move a lot of data around to replace the rows marked as deleted with non-deleted rows. It essentially means that the full table is scanned and a lot of write operations take place on each column (proportional with the amount of deleted rows). Furthermore, the fresh index creation adds some extra time.

      Notes

      • While reorganize is an expensive operation, its cost is amortized over many deletes.
      • Reorganize is triggered by default when a quarter of the rows are deleted. This may lead to a perceived decrease in performance when it happens, particularly for small deletes (i.e. deleting a couple of rows takes a long time). To alleviate this problem, it is possible to trigger the reorganize operation explicitly:
      reorganize table t
      
      • By default the reorganize command will not trigger a table reorganize if the number of rows marked as deleted is smaller than 12.5% - in this case the system recognizes that the number of deleted rows is too small and a reorganize is likely to only incur unnecessary performance penalties. To trigger a reorganize even in this case, it needs to be explicitly enforced:
      reorganize table t enforce
      

      Delete versus Reorganize

      The table below summarizes the discussion above

        Delete Reorganize
      Semantics Marks rows as deleted, data is not removed Physically removes data from the tables
      Occurence Whenever data is deleted (delete or merge) When more than 25% of rows are marked as deleted or when explicitly triggered
      Performance Fast Slow, needs to reorganize all the columns and re-create the indices
      Show
      Delete The delete operation ensures the removal of data from tables. It works by simply marking the affected rows as deleted, without actually removing them physically from the table. This means that the data is still there, but it is simply ignored by the subsequent queries. Performance: Since no data is ever moved around, delete is normally very fast. How to determine the percentage of deleted rows To see what percentage of data is marked as deleted in a table, please see the DELETE_PERCENTAGE column in EXA_*_TABLES. Reorganize While delete is very fast, it does present a drawback: data that is no longer needed is still stored leading to some extra memory usage. To limit this drawback, whenever too many rows are deleted, the table is "reorganized" by physically replacing the deleted rows with non-deleted ones. After the reorganize is completed, the table contains no rows that are marked as deleted. Moreover, all the indices of the affected table are dropped and created again. Performance: Normally, the reorganize operation is an expensive operation. This is because it typically needs to move a lot of data around to replace the rows marked as deleted with non-deleted rows. It essentially means that the full table is scanned and a lot of write operations take place on each column (proportional with the amount of deleted rows). Furthermore, the fresh index creation adds some extra time. Notes While reorganize is an expensive operation, its cost is amortized over many deletes. Reorganize is triggered by default when a quarter of the rows are deleted. This may lead to a perceived decrease in performance when it happens, particularly for small deletes (i.e. deleting a couple of rows takes a long time). To alleviate this problem, it is possible to trigger the reorganize operation explicitly: reorganize table t By default the reorganize command will not trigger a table reorganize if the number of rows marked as deleted is smaller than 12.5% - in this case the system recognizes that the number of deleted rows is too small and a reorganize is likely to only incur unnecessary performance penalties. To trigger a reorganize even in this case, it needs to be explicitly enforced: reorganize table t enforce Delete versus Reorganize The table below summarizes the discussion above   Delete Reorganize Semantics Marks rows as deleted, data is not removed Physically removes data from the tables Occurence Whenever data is deleted (delete or merge) When more than 25% of rows are marked as deleted or when explicitly triggered Performance Fast Slow, needs to reorganize all the columns and re-create the indices
    • Solution:
      Hide

      When many deletes are expected, an explicit reorganize of the affected tables should improve performance; for instance, such a pre-emptive reorganize can be done when the system is not under heavy usage.

      Show
      When many deletes are expected, an explicit reorganize of the affected tables should improve performance; for instance, such a pre-emptive reorganize can be done when the system is not under heavy usage.
    • Category 1:
      SQL
    • Category 2:
      Database Administration - Data Organization

      Attachments

        Activity

          People

          • Assignee:
            CaptainEXA Captain EXASOL
            Reporter:
            CaptainEXA Captain EXASOL
          • Votes:
            1 Vote for this issue
            Watchers:
            1 Start watching this issue

            Dates

            • Created:
              Updated: