Commit Graph

63 Commits

Author SHA1 Message Date
Lisandro Pin ba1d82db90 Move shell.ErrorWaitGroup into a common file, to cleanly reuse across weed shell commands. (#6780)
Move `shell.ErrorWaitGroup` into a dedicated common file, to cleanly reuse across `weed shell` commands.
2025-05-12 14:38:55 -07:00
Lisandro Pin 848d1f7c34 Improve safety for weed shell's ec.encode. (#6773)
Improve safety for `weed shell`'s `ec.encode`.

The current process for `ec.encode` is:

1. EC shards for a volume are generated and added to a single server
2. The original volume is deleted
3. EC shards get re-balanced across the entire topology

It is then possible to lose data between steps 2 and 3 if the underlying volume storage/server/rack/DC
happens to fail, for whatever reason. As a fix, this MR reworks `ec.encode` so:

  * Newly created EC shards are spread across all locations for the source volume.
  * Source volumes are deleted only after EC shards are converted and balanced.
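
The reworked ordering can be sketched in Go as follows. This is only an illustration of the sequence described in this message, not the code from the change: the `encoder` interface, its method names, `safeEncode`, and the `recorder` fake are all hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// encoder lists the steps of an EC encode. The interface and its method
// names are assumptions made for this sketch, not SeaweedFS APIs.
type encoder interface {
	GenerateShards(volumeID uint32) error
	BalanceShards(volumeID uint32) error
	DeleteSourceVolume(volumeID uint32) error
}

// safeEncode deletes the source volume only after the EC shards have been
// generated and balanced; any earlier failure leaves the source untouched.
func safeEncode(e encoder, volumeID uint32) error {
	if err := e.GenerateShards(volumeID); err != nil {
		return fmt.Errorf("generate shards for volume %d: %w", volumeID, err)
	}
	if err := e.BalanceShards(volumeID); err != nil {
		return fmt.Errorf("balance shards for volume %d: %w", volumeID, err)
	}
	return e.DeleteSourceVolume(volumeID)
}

// recorder is a fake encoder that logs every step it is asked to run and
// fails on the step named by failOn (if any).
type recorder struct {
	steps  []string
	failOn string
}

func (r *recorder) run(step string) error {
	r.steps = append(r.steps, step)
	if step == r.failOn {
		return errors.New(step + " failed")
	}
	return nil
}

func (r *recorder) GenerateShards(uint32) error     { return r.run("generate") }
func (r *recorder) BalanceShards(uint32) error      { return r.run("balance") }
func (r *recorder) DeleteSourceVolume(uint32) error { return r.run("delete") }
```

With a `recorder` whose `failOn` is `"balance"`, `safeEncode` returns an error and never reaches the delete step, so the source volume survives a failed balance.
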
2025-05-09 09:01:32 -07:00
Lisandro Pin c07596691c ec.encode: Fix resolution of target collections. (#6585)
* Don't ignore empty (`""`) collection names when computing collections for a given volume ID.

* `ec.encode`: Fix resolution of target collections.

When no `volumeId` parameter is provided, compute volumes
based on the provided collection name, even if it's empty (`""`).

This restores the behavior from before the recent EC rebalancing rework. See also
https://github.com/seaweedfs/seaweedfs/blob/ec30a504bae6cad75f859964e14c60d39cc43709/weed/shell/command_ec_encode.go#L99 .
2025-02-28 11:42:19 -08:00
Lisandro Pin 76a111f0a2 Fix calculation of node's free EC shard slots. (#6584) 2025-02-28 07:35:28 -08:00
Lisandro Pin e8d8bfcccc Nit: fix missing newlines on weed shell commands output. (#6524)
Nit: fix missing newlines on `weed` commands output.
2025-02-07 10:27:04 -08:00
Lisandro Pin 29c2d9b965 Remove warning on EC balancing if no replica placement settings are found. (#6516)
Effectively undoes c9399a68; with ff8bd862, a replica placement type `000`
will no longer break shards re-balancing.

Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
2025-02-06 09:19:28 -08:00
Lisandro Pin 68f547bdf2 Nit: fix missing newline on EC balancing warnings regarding replica settings (#6509)
Nit: fix missing newline on EC balancing warnings regarding replica settings.

See 79136812.
2025-02-04 10:59:25 -08:00
Lisandro Pin 331c1f0f3f Improve EC shards balancing logic regarding replica placement settings. (#6491)
The replica placement type specifies the number of _replicas_ on the same/different rack;
that means we can have one EC shard copy on each, even if the replica setting is zero.

This PR reworks replica placement parsing for EC rebalancing, so we allow
(replica placement + 1) when selecting racks and nodes to balance EC shards into.
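
A minimal Go sketch of the "+ 1" rule, assuming the limit is applied per rack or per node as a simple count check. The helper names `allowedShards` and `canAcceptShard` are hypothetical and do not come from the change itself.

```go
package main

// allowedShards returns how many shards of one EC volume a rack or node may
// hold for a given replica count: the original copy plus its replicas.
// A replica setting of zero therefore still allows one shard.
func allowedShards(replicaCount int) int {
	return replicaCount + 1
}

// canAcceptShard reports whether a candidate that already holds `existing`
// shards of the volume may take one more under that limit.
func canAcceptShard(existing, replicaCount int) bool {
	return existing < allowedShards(replicaCount)
}
```

Under this reading, a placement digit of `0` lets each rack or node hold one shard of the volume instead of none.
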
2025-01-30 09:26:45 -08:00
Lisandro Pin 250fbbb3db ec.balance: Allow EC balancing without collections. (#6488) 2025-01-29 08:51:59 -08:00
Lisandro Pin 7913681297 ec.encode: Display a warning on EC balancing if no replica placement settings are found. (#6487) 2025-01-29 08:50:19 -08:00
chrislu ec155022e7 "golang.org/x/exp/slices" => "slices" and go fmt 2024-12-19 19:25:06 -08:00
Lisandro Pin ba0707af64 Allow configuring the maximum number of concurrent tasks for EC parallelization. (#6376)
Follow-up to b0210df0.
2024-12-18 13:26:26 -08:00
Lisandro Pin 9fbc4ea417 Rework shell.EcBalance()'s waitgroup code into a standalone type. (#6373)
Rework the error-collecting waitgroup code in `shell.EcBalance()` into a standalone type.

We'll re-use this for other EC jobs - for example, volume creation. Also fixes
potential concurrency issues when collecting error results.
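
As an illustration of the idea, here is a small Go sketch of a wait group that also collects errors. It is not the type introduced by this change: `errGroup`, its methods, and `errDemo` are hypothetical names, and the mutex around the error slice is one assumed way to avoid races when goroutines report failures concurrently.

```go
package main

import (
	"errors"
	"sync"
)

// errDemo is a sentinel error used only to exercise the sketch.
var errDemo = errors.New("demo task failed")

// errGroup runs tasks in goroutines and collects the errors they return.
// It is a hypothetical stand-in, not the type from the SeaweedFS code base.
type errGroup struct {
	wg   sync.WaitGroup
	mu   sync.Mutex // guards errs against concurrent appends
	errs []error
}

// Go starts task in its own goroutine and records its error, if any.
func (g *errGroup) Go(task func() error) {
	g.wg.Add(1)
	go func() {
		defer g.wg.Done()
		if err := task(); err != nil {
			g.mu.Lock()
			g.errs = append(g.errs, err)
			g.mu.Unlock()
		}
	}()
}

// Wait blocks until every task has finished and returns all collected
// errors joined into one, or nil if no task failed.
func (g *errGroup) Wait() error {
	g.wg.Wait()
	return errors.Join(g.errs...)
}
```

Callers queue work with `Go` and check a single error from `Wait`, which keeps the locking in one place rather than in every command that spawns goroutines.
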
2024-12-17 09:39:51 -08:00
Lisandro Pin 9b48ce0613 Parallelize EC shards balancing within racks (#6354)
Parallelize EC shards balancing within racks.
2024-12-15 13:36:23 -08:00
Lisandro Pin 926cfea3dc Parallelize EC shards balancing across racks. (#6352) 2024-12-13 06:05:32 -08:00
Lisandro Pin b81def5e5c Parallelize EC balancing for racks. (#6351) 2024-12-13 05:33:53 -08:00
Lisandro Pin b0210df081 Begin implementing EC balancing parallelization support. (#6342)
* Begin implementing EC balancing parallelization support.

Impacts both `ec.encode` and `ec.balance`.

* Nit: improve type naming.

* Make the goroutine workgroup handler for `EcBalance()` a bit smarter/error-proof.

* Nit: unify naming for `ecBalancer` wait group methods with the rest of the module.

* Fix concurrency bug.

* Fix whitespace after Gitlab automerge.

* Delete stray TODO.
2024-12-12 09:14:44 -08:00
Lisandro Pin 23ffbb083c Limit EC re-balancing for ec.encode to relevant collections when a volume ID argument is provided. (#6347)
Limit EC re-balancing for `ec.encode` to relevant collections when a volume ID is provided.
2024-12-12 08:41:33 -08:00
Lisandro Pin 8c82c037b9 Unify the re-balancing logic for ec.encode with ec.balance. (#6339)
Among others, this enables recent changes related to topology aware
re-balancing at EC encoding time.
2024-12-10 13:30:13 -08:00
Lisandro Pin 522a25790a Remove average constraints when selecting nodes/racks to balance EC shards into. (#6325) 2024-12-06 09:00:06 -08:00
Lisandro Pin 34cdbdd279 Share common parameters for EC re-balancing functions under a single struct. (#6319)
TODO cleanup for https://github.com/seaweedfs/seaweedfs/discussions/6179.
2024-12-05 09:00:46 -08:00
Lisandro Pin edef485333 Account for replication placement settings when balancing EC shards within the same rack. (#6317)
* Account for replication placement settings when balancing EC shards within racks.

* Update help contents for `ec.balance`.

* Add a few more representative test cases for `pickEcNodeToBalanceShardsInto()`.
2024-12-04 10:47:51 -08:00
Lisandro Pin 351efa134d Account for replication placement settings when balancing EC shards across racks. (#6316) 2024-12-04 09:00:55 -08:00
Lisandro Pin b2ba7d7408 Resolve replica placement for EC volumes from master server defaults. (#6303) 2024-12-02 08:44:07 -08:00
Lisandro Pin 9a741a61b1 Display details upon failures to re-balance EC shards across racks. (#6299) 2024-11-28 08:42:41 -08:00
Lisandro Pin 559a1fd0f4 Improve EC shards rebalancing logic across nodes (#6297)
* Improve EC shards rebalancing logic across nodes.

- Favor target nodes with fewer preexisting shards, to ensure a fair distribution.
- Randomize selection when multiple possible target nodes are available.
- Add logic to account for replication settings when selecting target nodes (currently disabled).

* Fix minor test typo.

* Clarify internal error messages for `pickEcNodeToBalanceShardsInto()`.
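
The first two selection rules listed above (prefer the fewest preexisting shards, pick randomly among ties) can be sketched in Go as below. This is a simplified illustration, not the body of `pickEcNodeToBalanceShardsInto()`: the `candidate` type and `pickTarget` function are hypothetical, and replication settings are ignored.

```go
package main

import "math/rand"

// candidate is a hypothetical target node with the number of shards it
// already holds for the volume being balanced.
type candidate struct {
	ID     string
	Shards int
}

// pickTarget returns a candidate with the fewest shards, choosing at random
// when several candidates tie. ok is false when there are no candidates.
func pickTarget(candidates []candidate) (picked candidate, ok bool) {
	var best []candidate
	for _, c := range candidates {
		switch {
		case len(best) == 0 || c.Shards < best[0].Shards:
			// Strictly better than anything seen so far: restart the tie list.
			best = []candidate{c}
		case c.Shards == best[0].Shards:
			best = append(best, c)
		}
	}
	if len(best) == 0 {
		return candidate{}, false
	}
	return best[rand.Intn(len(best))], true
}
```

Randomizing among equally loaded nodes keeps repeated runs from always filling the same node first.
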
2024-11-27 11:51:57 -08:00
chrislu 04081128a9 use math rand v2 2024-11-21 08:54:03 -08:00
Lisandro Pin ca499de1cb Improve EC shards rebalancing logic across racks (#6270)
Improve EC shards rebalancing logic across racks.

  - Favor target racks with fewer preexisting shards, to ensure a fair distribution.
  - Randomize selection when multiple possible target racks are available.
  - Add logic to account for replication settings when selecting target racks (currently disabled).
2024-11-21 08:46:24 -08:00
Lisandro Pin 0d5393641e Unify usage of shell.EcNode.dc as DataCenterId. (#6258) 2024-11-19 06:33:18 -08:00
Lisandro Pin f2db746690 Introduce logic to resolve volume replica placement within EC rebalancing. (#6254)
* Rename `command_ec_encode_test.go` to `command_ec_common_test.go`.

All tests defined in this file are now for `command_ec_common.go`.

* Minor code cleanups.

- Fix broken `ec.balance` test.
- Rework integer ceiling division to not use floats, which can introduce precision errors.

* Introduce logic to resolve volume replica placement within EC rebalancing.

This will be used to make rebalancing logic topology-aware.

* Give shell.EcNode.dc a dedicated DataCenterId type.
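
For the integer ceiling division mentioned in the cleanups above, one float-free formulation in Go is sketched here. The message does not show which formula the change uses, so `ceilDiv` is a hypothetical helper that assumes a non-negative dividend and a positive divisor.

```go
package main

// ceilDiv returns the ceiling of a / b using integer arithmetic only.
// It assumes a >= 0 and b > 0. Unlike int(math.Ceil(float64(a) / float64(b))),
// it cannot lose precision when the operands are very large.
func ceilDiv(a, b int) int {
	q := a / b
	if a%b != 0 {
		q++
	}
	return q
}
```

For example, spreading 14 shards over 4 racks gives `ceilDiv(14, 4) == 4` shards per rack at most.
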
2024-11-18 18:05:06 -08:00
Lisandro Pin efdebf712e Refactor ec.balance logic into a weed/shell/command_ec_common.go… (#6195)
* Refactor `ec.balance` logic into a `weed/shell/command_ec_common.go` standalone function.

This is a prerequisite to unify the balance logic for `ec.balance` and `ec.encode`.

* s/Balance()/EcBalance()/g
2024-11-04 17:56:20 -08:00
chrislu 645ae8c57b Revert "Revert "Merge branch 'master' of https://github.com/seaweedfs/seaweedfs""
This reverts commit 8cb42c39
2023-09-25 09:35:16 -07:00
chrislu 8cb42c39ad Revert "Merge branch 'master' of https://github.com/seaweedfs/seaweedfs"
This reverts commit 2e5aa06026, reversing
changes made to 4d414f54a2.
2023-09-18 16:12:50 -07:00
dependabot[bot] a04bd4d26f Bump github.com/rclone/rclone from 1.63.1 to 1.64.0 (#4850)
* Bump github.com/rclone/rclone from 1.63.1 to 1.64.0

Bumps [github.com/rclone/rclone](https://github.com/rclone/rclone) from 1.63.1 to 1.64.0.
- [Release notes](https://github.com/rclone/rclone/releases)
- [Changelog](https://github.com/rclone/rclone/blob/master/RELEASE.md)
- [Commits](https://github.com/rclone/rclone/compare/v1.63.1...v1.64.0)

---
updated-dependencies:
- dependency-name: github.com/rclone/rclone
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

* API changes

* go mod

---------

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Chris Lu <chrislusf@users.noreply.github.com>
Co-authored-by: chrislu <chris.lu@gmail.com>
2023-09-18 14:43:05 -07:00
chrislu f9383aa726 refactor to change capacity data type 2022-10-09 18:58:10 -07:00
Ryan Russell bd2dc6d641 refactor(shell): Decending -> Descending (#3675)
Signed-off-by: Ryan Russell <git@ryanrussell.org>
2022-09-14 12:06:48 -07:00
chrislu 676e27c589 shell: stop long running jobs if lock is lost 2022-08-22 14:12:23 -07:00
chrislu 26dbc6c905 move to https://github.com/seaweedfs/seaweedfs 2022-07-29 00:17:28 -07:00
justin 3551ca2fcf enhancement: replace sort.Slice with slices.SortFunc to reduce reflection 2022-04-18 10:35:43 +08:00
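
The difference between the two sorting calls named in the commit above can be sketched in Go as follows. The `server` type and both function names are hypothetical. The sketch uses the standard library `slices` package, whose `SortFunc` takes a three-way comparison; the package imported at the time of that commit may have had a different comparator signature.

```go
package main

import (
	"cmp"
	"slices"
	"sort"
)

// server is a hypothetical element type for the comparison.
type server struct {
	ID        string
	FreeSlots int
}

// sortWithSortSlice orders servers by descending free slots using
// sort.Slice, which swaps elements through reflection.
func sortWithSortSlice(servers []server) {
	sort.Slice(servers, func(i, j int) bool {
		return servers[i].FreeSlots > servers[j].FreeSlots
	})
}

// sortWithSortFunc produces the same order with the generic
// slices.SortFunc, which works on the typed slice directly.
func sortWithSortFunc(servers []server) {
	slices.SortFunc(servers, func(a, b server) int {
		return cmp.Compare(b.FreeSlots, a.FreeSlots)
	})
}
```

Neither call is a stable sort, so elements with equal keys may end up in a different relative order between the two.
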
chrislu 21aaa4c1f1 ec.encode: calculate free ec slots based on (maxVolumeCount-volumeCount)
fix https://github.com/chrislusf/seaweedfs/issues/2642
2022-02-08 01:51:13 -08:00
chrislu f18803424a volume.balance: add delay during tight loop
fix https://github.com/chrislusf/seaweedfs/issues/2637
2022-02-08 00:53:55 -08:00
chrislu 9f9ef1340c use streaming mode for long poll grpc calls
Streaming mode creates a separate gRPC connection for each call,
which ensures that long-poll connections are properly closed.
2021-12-26 00:15:03 -08:00
Chris Lu e5fc35ed0c change server address from string to a type 2021-09-12 22:47:52 -07:00
Chris Lu 1c233ad986 refactoring 2021-02-22 00:28:42 -08:00
Chris Lu a0c6db361c avoid nil 2021-02-16 05:33:38 -08:00
Chris Lu 36f95e50a9 avoid possible nil disk info 2021-02-16 05:13:48 -08:00
Chris Lu f8446b42ab this can compile now!!! 2021-02-16 02:47:02 -08:00
Chris Lu a595916342 shell: add volumeServer.evacuate command 2020-09-14 23:47:11 -07:00
Chris Lu d15682b4a1 shell: volume.balance plan by ratio of fullness 2020-09-12 04:06:26 -07:00
Chris Lu 892e726eb9 avoid reusing context object
fix https://github.com/chrislusf/seaweedfs/issues/1182
2020-02-25 21:50:12 -08:00