Improve sort buffer sizing heuristics and honor explicit --buffer-size by mattsu2020 · Pull Request #8833 · uutils/coreutils · GitHub

mattsu2020
Contributor

Add automatic buffer-size heuristics (ported from commit a0e77d9). We now size external-sort chunks based on input file sizes and available memory, clamping to 512 KiB–128 MiB so we avoid both tiny buffers and risky overcommit on constrained systems.

Respect user-provided --buffer-size. Only automatically computed sizes are raised to the safety minimum; explicit values are left untouched, which keeps external sorting and --compress-program working even when users choose small buffers.
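
The two rules above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the function name, the quarter-of-available-memory limit, and the fallback used when the input size is unknown are assumptions. Only the 512 KiB and 128 MiB bounds and the rule that explicit values are left untouched come from the description.

```rust
/// Bounds for automatically chosen buffers, as stated in the PR description.
const MIN_AUTO_BUFFER: u64 = 512 * 1024; // 512 KiB
const MAX_AUTO_BUFFER: u64 = 128 * 1024 * 1024; // 128 MiB

/// Hypothetical helper: choose the sort buffer size in bytes.
fn effective_buffer_size(
    explicit: Option<u64>,
    input_bytes: Option<u64>,
    available_bytes: Option<u64>,
) -> u64 {
    // An explicit --buffer-size is returned untouched, even if it is tiny.
    if let Some(size) = explicit {
        return size;
    }
    // Assumption: start from the input size, or the upper bound if unknown.
    let mut candidate = input_bytes.unwrap_or(MAX_AUTO_BUFFER);
    // Assumption: never plan for more than a quarter of available memory.
    if let Some(available) = available_bytes {
        candidate = candidate.min(available / 4);
    }
    // Only automatically computed sizes are clamped to the safety range.
    candidate.clamp(MIN_AUTO_BUFFER, MAX_AUTO_BUFFER)
}

fn main() {
    // Explicit small buffer: respected as-is.
    println!("{}", effective_buffer_size(Some(4096), None, None));
    // Tiny input, no explicit size: raised to the 512 KiB minimum.
    println!("{}", effective_buffer_size(None, Some(1024), None));
}
```

Keeping the explicit branch first means a small user-chosen buffer still forces external sorting, which is the behavior the description says must keep working with --compress-program.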

Performance Comparison (baseline vs. current)
Measurements come from hyperfine --warmup 3 --runs 10; values are means in milliseconds (lower is better).

| Scenario | Baseline (ms) | Current (ms) | Delta (ms) | Speedup |
| --- | --- | --- | --- | --- |
| ASCII 500k | 17.04 | 14.59 | -2.45 | 1.17× |
| Numeric 500k | 36.89 | 37.29 | +0.40 | 0.99× |
| ASCII 4M | 112.47 | 107.52 | -4.95 | 1.05× |
| ASCII 4M (-S 32M) | 207.40 | 100.48 | -106.92 | 2.06× |
| ASCII 16M | 854.18 | 414.45 | -439.73 | 2.06× |
| ASCII 16M (-S 512M) | 838.05 | 426.85 | -411.20 | 1.96× |
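
For readers checking the figures: the Delta column is current minus baseline, and the Speedup column is baseline divided by current. A small sketch of that arithmetic, with `compare` being a name chosen here for illustration:

```rust
/// Returns (delta_ms, speedup) for a pair of mean run times in milliseconds.
fn compare(baseline_ms: f64, current_ms: f64) -> (f64, f64) {
    // Negative delta means the current build is faster.
    let delta = current_ms - baseline_ms;
    // Speedup above 1.0 means the current build is faster.
    let speedup = baseline_ms / current_ms;
    (delta, speedup)
}

fn main() {
    // Values taken from the "ASCII 4M (-S 32M)" row.
    let (delta, speedup) = compare(207.40, 100.48);
    println!("delta = {delta:.2} ms, speedup = {speedup:.2}x");
}
```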

github-actions bot commented Oct 6, 2025

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

github-actions bot commented Oct 7, 2025

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)

@sylvestre
Contributor

you have to refresh the fuzz/Cargo.lock file

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

@sylvestre
Contributor

please follow this documentation for the performance work:
https://github.com/uutils/coreutils/blob/main/docs/src/performance.md

We would like to see hyperfine results and codspeed benchmark :)
thanks

quarter.max(max)
}

#[cfg(target_os = "linux")]
Contributor

I was already not convinced in #8802, but there is probably a better way than parsing /proc/meminfo, especially in the sort.rs code.
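
For context, hand-parsing /proc/meminfo typically looks something like the sketch below. It is not taken from the PR (the author later says the code was reimplemented with a different method); the function name and the choice of the `MemAvailable` field are assumptions made for illustration. The function takes the file contents as a string so it can be exercised without a Linux system.

```rust
/// Extracts `MemAvailable` (reported in kiB) from /proc/meminfo-style text
/// and returns it in bytes. Returns `None` if the field is missing or malformed.
fn mem_available_bytes(meminfo: &str) -> Option<u64> {
    meminfo.lines().find_map(|line| {
        // Lines look like "MemAvailable:    8192000 kB".
        let rest = line.strip_prefix("MemAvailable:")?;
        let kib: u64 = rest.split_whitespace().next()?.parse().ok()?;
        // Guard against overflow when converting kiB to bytes.
        kib.checked_mul(1024)
    })
}

fn main() {
    let sample = "MemTotal:       16384000 kB\n\
                  MemFree:         1024000 kB\n\
                  MemAvailable:    8192000 kB\n";
    println!("{:?}", mem_available_bytes(sample));
}
```

Even in this small form the parser has to handle a missing field, bad numbers, and unit conversion, which is the kind of detail the reviewer would rather not have living in sort.rs.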

Contributor Author

Implemented it using a different method

Contributor

I said it in the other PR, but this does not belong in sort; it belongs in uucore.
And maybe we already have such functions.

Contributor Author
@mattsu2020 mattsu2020 Oct 13, 2025

Is it total_physical_memory? Does it perform the same processing?
Is it acceptable to modify uucore to add available memory?

codspeed-hq bot commented Oct 11, 2025

CodSpeed Performance Report

Merging #8833 will degrade performances by 3.04%

Comparing mattsu2020:sort_performan-ce (9914a22) with main (905629b)

Summary

⚡ 16 improvements
❌ 3 regressions
✅ 87 untouched
⏩ 73 skipped [1]

⚠️ Please fix the performance issues or acknowledge them on CodSpeed.

Benchmarks breakdown

| Benchmark | BASE | HEAD | Change |
| --- | --- | --- | --- |
| sort_accented_data[500000] | 828.6 ms | 361.8 ms | ×2.3 |
| sort_ascii_only[500000] | 807.3 ms | 352.9 ms | ×2.3 |
| sort_case_insensitive[500000] | 427.6 ms | 282.6 ms | +51.3% |
| sort_case_sensitive[500000] | 337.6 ms | 173.7 ms | +94.35% |
| sort_key_field[500000] | 697.9 ms | 717 ms | -2.67% |
| sort_long_line[160000] | 1.8 ms | 1.9 ms | -3.04% |
| sort_mixed_data[500000] | 740.1 ms | 327.3 ms | ×2.3 |
| sort_numeric[500000] | 1.2 s | 1.2 s | -2.51% |
| sort_reverse_locale[500000] | 830 ms | 361 ms | ×2.3 |
| sort_unique_locale[500000] | 1,151.6 ms | 482.4 ms | ×2.4 |
| sort_ascii_c_locale | 29.4 ms | 22.9 ms | +28.31% |
| sort_ascii_utf8_locale | 59.2 ms | 46.2 ms | +28.26% |
| sort_german_c_locale | 95.4 ms | 38.8 ms | ×2.5 |
| sort_german_locale | 95.4 ms | 39 ms | ×2.4 |
| sort_mixed_c_locale | 94.7 ms | 38.6 ms | ×2.5 |
| sort_mixed_utf8_locale | 95 ms | 38.7 ms | ×2.5 |
| sort_random_strings | 54.6 ms | 30.3 ms | +80.52% |
| sort_reverse_mixed | 94.2 ms | 38.3 ms | ×2.5 |
| sort_unique_mixed | 88.5 ms | 39.2 ms | ×2.3 |

Footnotes

  1. 73 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.

@sylvestre
Contributor

impressive wins :)

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

@sylvestre
Contributor

i am planning to merge #8746
because:

GNU testsuite comparison:

Skip an intermittent issue tests/misc/tee (fails in this run but passes in the 'main' branch)
Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

@mattsu2020 mattsu2020 requested a review from sylvestre October 13, 2025 01:15
@sylvestre
Contributor

please try to avoid commit messages like "fix"
it doesn't help reviewers

  use fnv::FnvHasher;
  #[cfg(target_os = "linux")]
- use nix::libc::{RLIMIT_NOFILE, getrlimit, rlimit};
+ use libc::{RLIMIT_NOFILE, rlimit};
Contributor

it was more here. why do you move to libc ?
what was wrong with nix ?

Contributor Author

Since I was using libc elsewhere, I thought it would be better to match that, but the original code is better.

GNU testsuite comparison:

Skip an intermittent issue tests/tail/overlay-headers (fails in this run but passes in the 'main' branch)
Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

Contributor Author
@mattsu2020 mattsu2020 left a comment

- Add comment
- Change to nix

@sylvestre
Contributor

I have doubts about the memory functions being in sort itself. They should be in uucore if we don't already have any there.

Also, please keep in mind that you are writing to a human. If I want an AI to explain things to me, I can do it myself :)

Contributor Author
@mattsu2020 mattsu2020 left a comment

Added comments to the source code.

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

Contributor Author

move memory functions to uucore

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

// Heuristically chosen: Dividing by 10 seems to keep our memory usage roughly
// around settings.buffer_size as a whole.
let buffer_size = settings.buffer_size / 10;
// Cap oversized buffer requests at 512MiB to avoid unnecessary allocations
Contributor

please explain why 512 :)

Contributor Author

The goal is to cap automatic_buffer_size so that the reader and sorter never hold more than about 1 GiB of data at once; without a cap they could grab several gigabytes and waste memory.
Because the two buffers are allocated simultaneously and together add up to 1 GiB, I set the cap to 512 MiB.
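
A minimal sketch of the sizing step discussed in this thread, assuming the 512 MiB cap is applied to the already divided value (the quoted snippet does not show the line that applies the cap, so the order is a guess). The function and constant names are hypothetical.

```rust
/// Cap from the review thread: two buffers of this size add up to about 1 GiB.
const MAX_CHUNK_BUFFER: u64 = 512 * 1024 * 1024; // 512 MiB

/// Hypothetical helper: derive the per-chunk buffer from the overall setting.
fn chunk_buffer_size(settings_buffer_size: u64) -> u64 {
    // Heuristic from the quoted code: dividing by 10 keeps total memory use
    // roughly around the configured buffer size.
    let divided = settings_buffer_size / 10;
    // Assumption: the cap is applied after the division.
    divided.min(MAX_CHUNK_BUFFER)
}

fn main() {
    // 10 MiB setting -> 1 MiB chunk buffer.
    println!("{}", chunk_buffer_size(10 * 1024 * 1024));
    // 100 GiB setting -> capped at 512 MiB.
    println!("{}", chunk_buffer_size(100 * 1024 * 1024 * 1024));
}
```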

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

tempfile = { workspace = true }

[target.'cfg(target_os = "linux")'.dependencies]
procfs = "0.18"
Contributor

Please use the one from workspace

Contributor Author

Fixed in a separate PR: #8968

@sylvestre
Contributor

Looks great.
Now, I don't want to merge these 54 commits or squash them into one.
Could you please clean it up? First uucore, then the memory functions, then the rest.

@mattsu2020
Contributor Author

> Looks great. Now, I don't want to merge these 54 commits or squash them into one. Could you please clean it up? First uucore, then the memory functions, then the rest.

First, the uucore change: #8951

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

@mattsu2020
Contributor Author

> Looks great. Now, I don't want to merge these 54 commits or squash them into one. Could you please clean it up? First uucore, then the memory functions, then the rest.

First, the uucore change: #8951
Second, the memory function change: #8959

}
}

fn file_size_hint(files: &[OsString]) -> Option<usize> {
Contributor

please document this
and maybe it would make sense to move these memory functions into a specific file
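
As an illustration of the kind of documentation being requested, here is a documented sketch with the same signature. The body is an assumption written for this example (the PR's implementation is not shown in the thread): it sums the sizes of regular files and gives up as soon as any input has no usable size.

```rust
use std::ffi::OsString;
use std::fs;

/// Returns the combined size in bytes of all input files, as a hint for
/// choosing a buffer size.
///
/// Returns `None` when no reliable hint exists: the list is empty, an input
/// cannot be inspected (for example `-` for stdin or a missing path), an
/// input is not a regular file, or the total does not fit in `usize`.
fn file_size_hint(files: &[OsString]) -> Option<usize> {
    if files.is_empty() {
        return None;
    }
    let mut total: usize = 0;
    for file in files {
        let metadata = fs::metadata(file).ok()?;
        if !metadata.is_file() {
            return None;
        }
        let len = usize::try_from(metadata.len()).ok()?;
        total = total.checked_add(len)?;
    }
    Some(total)
}

fn main() {
    // Write a small temporary file and ask for its size hint.
    let path = std::env::temp_dir().join(format!("size_hint_demo_{}.txt", std::process::id()));
    fs::write(&path, b"b\na\nc\n").expect("write temp file");
    println!("{:?}", file_size_hint(&[path.clone().into_os_string()]));
    fs::remove_file(&path).expect("remove temp file");
}
```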

fn physical_memory_bytes() -> Option<u128> {
#[cfg(all(target_family = "unix", not(target_os = "redox")))]
{
let pages = unsafe { libc::sysconf(libc::_SC_PHYS_PAGES) };
Contributor

can you use nix instead of libc here?
it will probably remove unsafe

Contributor Author

[target.'cfg(target_os = "linux")'.dependencies]

Is it okay to modify this part too?
Since it depends on Linux.

Contributor

maybe ? please give it a try :)

Contributor Author

#8959 (comment)
In some environments, it seems the environment variables are missing.
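
Whichever crate ends up making the system call, the arithmetic behind `physical_memory_bytes` is the page count multiplied by the page size. The sketch below isolates that step so it needs no `unsafe` and no external crate. It assumes the page size comes from a second `sysconf` query, which the quoted snippet does not show, and the helper name is hypothetical. `sysconf` returns -1 on failure, which is why negative inputs map to `None`.

```rust
/// Hypothetical helper: turn `sysconf`-style results into a byte count.
/// Negative values (the -1 error return) and overflow both yield `None`.
fn physical_memory_from(pages: i64, page_size: i64) -> Option<u128> {
    let pages = u128::try_from(pages).ok()?;
    let page_size = u128::try_from(page_size).ok()?;
    pages.checked_mul(page_size)
}

fn main() {
    // 4,194,304 pages of 4 KiB each is 16 GiB.
    println!("{:?}", physical_memory_from(4_194_304, 4096));
    // A failed query propagates as None instead of a bogus size.
    println!("{:?}", physical_memory_from(-1, 4096));
}
```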

GNU testsuite comparison:

Skipping an intermittent issue tests/misc/tee (passes in this run but fails in the 'main' branch)

GNU testsuite comparison:

Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)

- move heuristics into a new buffer_hint module and default to automatic sizing when the buffer flag is absent

- tune chunk and external sort buffers to avoid runaway allocations

- add fast lexicographic and ASCII case-insensitive comparisons for the default mode

- refresh spell-check and dependency metadata for the new code
- keep the latest path/lock pair in a shared registry so SIGINT always cleans the active directory

- guard handler installation with an atomic flag and reset state when the wrapper is dropped
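
To make the "ASCII case-insensitive comparisons" item above concrete, here is one way such a comparison can be written without allocating folded copies of each line. It is a sketch under assumptions, not the PR's code: the function name is hypothetical, and folding to uppercase follows the documented behavior of GNU sort's --ignore-case option.

```rust
use std::cmp::Ordering;

/// Compares two byte strings, treating ASCII letters case-insensitively.
/// Bytes are folded to uppercase lazily, so no temporary buffers are needed.
fn cmp_ascii_ignore_case(a: &[u8], b: &[u8]) -> Ordering {
    a.iter()
        .map(u8::to_ascii_uppercase)
        .cmp(b.iter().map(u8::to_ascii_uppercase))
}

fn main() {
    let mut lines: Vec<&[u8]> = vec![b"banana", b"Apple", b"cherry", b"apple"];
    // A stable sort keeps "Apple" ahead of "apple" because they compare equal.
    lines.sort_by(|a, b| cmp_ascii_ignore_case(a, b));
    for line in lines {
        println!("{}", String::from_utf8_lossy(line));
    }
}
```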

GNU testsuite comparison:

Skipping an intermittent issue tests/tail/overlay-headers (passes in this run but fails in the 'main' branch)
Skipping an intermittent issue tests/timeout/timeout (passes in this run but fails in the 'main' branch)
