Compare commits

2 Commits

Author        SHA1        Message    Date
Danilo Reyes  4337a8847c  wowaweewa  2026-02-07 06:15:34 -06:00
Danilo Reyes  070a3633d8  init       2026-02-07 06:01:29 -06:00
41 changed files with 2963 additions and 36 deletions

19
.gitignore vendored

@@ -10,3 +10,22 @@
.DS_Store
Thumbs.db
*~
.vscode/
.idea/
*.tmp
*.swp
# Rust build/output
target/
debug/
release/
*.rs.bk
*.rlib
*.prof*
# Node/Svelte build/output
node_modules/
dist/
build/
*.log
.env*

.specify/memory/constitution.md

@@ -1,50 +1,168 @@
# [PROJECT_NAME] Constitution
<!-- Example: Spec Constitution, TaskFlow Constitution, etc. -->
<!--
Sync Impact Report
Version change: N/A (template) -> 1.0.0
Modified principles: N/A (initial adoption)
Added sections: Mission & Scope; Conceptual Model; Absolute Safety Rules; Deletion Philosophy;
Required User Workflows; Download List File Integrity; Auditing & Accountability;
Configuration & Boundaries; Documentation Persistence; Definition of Done; Out of Scope
Removed sections: N/A
Templates requiring updates:
- ✅ updated: .specify/templates/plan-template.md
- ✅ updated: .specify/templates/spec-template.md
- ✅ updated: .specify/templates/tasks-template.md
- ⚠️ pending: .specify/templates/commands/*.md (no files found)
Follow-up TODOs:
- TODO(RATIFICATION_DATE): original adoption date not found
-->
# Gallery Archive Curator Constitution
## Core Principles
### [PRINCIPLE_1_NAME]
<!-- Example: I. Library-First -->
[PRINCIPLE_1_DESCRIPTION]
<!-- Example: Every feature starts as a standalone library; Libraries must be self-contained, independently testable, documented; Clear purpose required - no organizational-only libraries -->
### I. Safety Over Convenience (NON-NEGOTIABLE)
All behavior MUST prioritize data preservation over speed or convenience. Whitelisted
user directories MUST never be deletable by any directory-level action. Any destructive
operation MUST be previewed, explicitly confirmed, and audited. A global read-only mode
MUST exist and fully disable mutations. Rationale: irreversible loss is unacceptable.
### [PRINCIPLE_2_NAME]
<!-- Example: II. CLI Interface -->
[PRINCIPLE_2_DESCRIPTION]
<!-- Example: Every library exposes functionality via CLI; Text in/out protocol: stdin/args → stdout, errors → stderr; Support JSON + human-readable formats -->
### II. Explicit, Deterministic Behavior
All behavior MUST be explicit, deterministic, and reproducible under the same
configuration. The system MUST refuse to operate outside configured root paths and
MUST NOT follow symlinks for destructive actions (only the link itself may be removed).
Rationale: predictable, bounded behavior prevents accidental loss.
### [PRINCIPLE_3_NAME]
<!-- Example: III. Test-First (NON-NEGOTIABLE) -->
[PRINCIPLE_3_DESCRIPTION]
<!-- Example: TDD mandatory: Tests written → User approved → Tests fail → Then implement; Red-Green-Refactor cycle strictly enforced -->
### III. Previewed, Confirmed, Reversible Deletion
Deletion MUST be a process, not a single action. The default MUST be two-stage deletion
(move to trash/staging first), with permanent deletion only by explicit configuration and
confirmation. All deletion behavior MUST remain reversible until explicitly finalized.
Rationale: human review requires a safety buffer.
### [PRINCIPLE_4_NAME]
<!-- Example: IV. Integration Testing -->
[PRINCIPLE_4_DESCRIPTION]
<!-- Example: Focus areas requiring integration tests: New library contract tests, Contract changes, Inter-service communication, Shared schemas -->
### IV. Traceability and Auditability
Every mutation MUST produce a durable, append-only audit record with timestamp, action
type, affected paths, list-file changes (if any), and outcome. The UI MUST expose recent
audit activity. Rationale: accountability and recovery depend on traceable history.
### [PRINCIPLE_5_NAME]
<!-- Example: V. Observability, VI. Versioning & Breaking Changes, VII. Simplicity -->
[PRINCIPLE_5_DESCRIPTION]
<!-- Example: Text I/O ensures debuggability; Structured logging required; Or: MAJOR.MINOR.BUILD format; Or: Start simple, YAGNI principles -->
### V. Human-Driven Workflow Clarity
The UI MUST always display the current state of each user directory and support fast,
intentional review flows. Required workflows (whitelisted media triage and untagged
decisioning) MUST surface essential context (owner, size, type, relative path) and
support random and size-prioritized review. Rationale: clear human intent drives safe
decisions.
## [SECTION_2_NAME]
<!-- Example: Additional Constraints, Security Requirements, Performance Standards, etc. -->
## Mission, Scope, and Conceptual Model
[SECTION_2_CONTENT]
<!-- Example: Technology stack requirements, compliance standards, deployment policies, etc. -->
### Mission
Provide a safe, human-driven system for curating large archives of downloaded media
where each directory represents a single user, enabling disk-space recovery while making
accidental data loss extremely difficult.
## [SECTION_3_NAME]
<!-- Example: Development Workflow, Review Process, Quality Gates, etc. -->
### Scope (In/Out)
**In scope**: manual curation, clear confirmation flows, auditing, and reversible deletion.
**Out of scope**: automatic or unattended deletion, machine-learning-based decisions,
silent bulk operations, cloud dependencies, or tight coupling to the scraper.
[SECTION_3_CONTENT]
<!-- Example: Code review requirements, testing gates, deployment approval process, etc. -->
### Conceptual Model
- A user directory is a folder containing media scraped from a single user.
- Each user directory exists in exactly one state: Untagged, Whitelisted, Blacklisted,
or Kept.
- The system MUST maintain a single, explicit source of truth for these states.
- The current state of any user directory MUST always be visible in the UI.
## Operational Safety, Workflows, and Data Integrity
### Absolute Safety Rules (Non-Negotiable)
- Whitelisted directories MUST never be deletable by any directory-level action.
- No destructive operation may occur without: (1) dry-run preview, (2) explicit
confirmation, and (3) persistent audit record.
- All filesystem operations MUST refuse to act outside configured root paths.
- A global read-only mode MUST exist and disable all mutations while allowing browsing.
- Destructive operations MUST be single-writer only and MUST NOT run concurrently.
- Symlinks MUST never be followed for destructive actions; only the link itself may be
removed.
### Deletion Philosophy
- Deletion is a process, not a single action.
- Default behavior MUST be two-stage deletion: move to trash/staging first.
- Permanent deletion MUST require explicit configuration and confirmation.
- Hard deletion MUST never be the silent default.
- All deletion behavior MUST be reversible until explicitly finalized.
### Required User Workflows
**Mode 1 — Whitelisted Media Triage**
- Purpose: reclaim space without risking loss of important users.
- Only individual media files may be deleted; the parent directory MUST remain
protected at all times.
- Media is shown one item at a time for rapid decision-making.
- Each item MUST display: owning user, file size, media type, and relative path.
- Ordering MUST support random review and size-prioritized review.
**Mode 2 — Untagged Directory Decision**
- Purpose: decide whether an entire user is worth keeping.
- A directory is reviewed via a collage of randomly sampled files.
- The sample MUST be refreshable without changing directories.
- Decisions:
- Keep: directory is moved or marked as preserved and removed from the untagged pool.
- Delete: directory is removed only after explicit confirmation and preview.
- When deleting a directory, the system MUST attempt to locate and optionally remove
the user from a plain-text download list file.
- List-file edits MUST be previewed and optional; directory deletion MUST NOT depend on
list-file presence.
### Download List File Integrity
- The download list file is a critical control surface.
- Removal rules MUST be explicit and conservative; default is exact-match removal only.
- All edits MUST be previewed and performed atomically.
- If no matching entry is found, the system MUST clearly state this and proceed safely.
### Auditing and Accountability
- Every mutation MUST produce a durable, append-only audit record.
- Audit records MUST include: timestamp, action type, affected paths, list-file changes
(if any), and outcome.
- Audit logs are core data and MUST never be optional.
- The UI MUST expose recent audit activity for verification.
### Configuration and Boundaries
- All operational paths (pools, whitelist, trash, list file) MUST be explicitly
configured.
- The system MUST refuse to operate with ambiguous or overlapping roots.
- Behavior MUST be deterministic under the same configuration.
- Configuration changes MUST NOT retroactively invalidate safety guarantees.
### Documentation Persistence Rule
Any of the following requires immediate documentation updates:
- Bug fixes affecting behavior or safety
- Edge cases discovered in real data
- Changes in deletion, confirmation, or matching rules
- Changes in directory state handling
- Changes in list-file identity logic
Documentation updates MUST include: what changed, why it changed, and how it affects
existing behavior. Undocumented behavior is a defect.
### Definition of Done
The project is compliant when:
- Whitelisted users are provably protected at the code level.
- All destructive actions are previewed, confirmed, and audited.
- Users can curate large archives without fear of silent loss.
- System behavior remains understandable months later through documentation.
## Governance
<!-- Example: Constitution supersedes all other practices; Amendments require documentation, approval, migration plan -->
This constitution is authoritative and supersedes all other practices. All changes MUST
explicitly verify compliance with the Core Principles and Operational Safety rules.
[GOVERNANCE_RULES]
<!-- Example: All PRs/reviews must verify compliance; Complexity must be justified; Use [GUIDANCE_FILE] for runtime development guidance -->
**Amendment Procedure**
- Any amendment MUST update this document, include rationale, and record the impact in
the Sync Impact Report.
- Any change affecting filesystem mutations, deletion workflows, or list-file rules MUST
include a migration or safety review plan.
- Compliance review is required for every change that touches destructive operations or
state handling.
**Version**: [CONSTITUTION_VERSION] | **Ratified**: [RATIFICATION_DATE] | **Last Amended**: [LAST_AMENDED_DATE]
<!-- Example: Version: 2.1.1 | Ratified: 2025-06-13 | Last Amended: 2025-07-16 -->
**Versioning Policy (Semantic)**
- MAJOR: backward-incompatible governance/principle removals or redefinitions.
- MINOR: new principle/section added or materially expanded guidance.
- PATCH: clarifications, wording fixes, or non-semantic refinements.
**Compliance Expectations**
- All plans and specs MUST include a constitution check before implementation.
- Any deviation requires explicit justification and documented alternatives.
**Version**: 1.0.0 | **Ratified**: TODO(RATIFICATION_DATE): original adoption date not found | **Last Amended**: 2026-02-07

.specify/templates/plan-template.md

@@ -31,7 +31,14 @@
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
[Gates determined based on constitution file]
- Whitelisted directories remain protected from any directory-level delete
- All destructive actions include dry-run preview, explicit confirmation, and audit log
- Read-only mode disables all mutations while allowing browsing
- Destructive operations are single-writer and never concurrent
- Operations are bounded to configured roots; destructive ops never follow symlinks
- Default deletion is two-stage (trash/staging); hard delete is explicit + confirmed
- List-file edits are previewed, optional, atomic; exact-match removal by default
- UI surfaces directory state and recent audit activity for verification
## Project Structure

.specify/templates/spec-template.md

@@ -95,6 +95,16 @@
- **FR-006**: System MUST authenticate users via [NEEDS CLARIFICATION: auth method not specified - email/password, SSO, OAuth?]
- **FR-007**: System MUST retain user data for [NEEDS CLARIFICATION: retention period not specified]
### Safety & Data Preservation Requirements *(mandatory for destructive actions)*
- **SR-001**: System MUST provide a dry-run preview for destructive actions
- **SR-002**: System MUST require explicit confirmation before destructive actions
- **SR-003**: System MUST append an audit record for every mutation
- **SR-004**: System MUST refuse to act outside configured root paths
- **SR-005**: System MUST NOT follow symlinks for destructive actions
- **SR-006**: System MUST provide a global read-only mode that disables mutations
- **SR-007**: System MUST default to two-stage deletion (trash/staging) unless explicitly configured
### Key Entities *(include if feature involves data)*
- **[Entity 1]**: [What it represents, key attributes without implementation]

.specify/templates/tasks-template.md

@@ -73,6 +73,24 @@ Examples of foundational tasks (adjust based on your project):
---
## Phase 2.5: Safety & Compliance (Mandatory for destructive operations)
**Purpose**: Enforce constitution safety guarantees before any deletion work
- [ ] T009A Implement global read-only mode that blocks all mutations
- [ ] T009B Enforce root-path boundaries for all filesystem operations
- [ ] T009C Implement single-writer guard for destructive operations
- [ ] T009D Implement dry-run preview + explicit confirmation flow for deletion
- [ ] T009E Implement two-stage deletion (trash/staging) as default behavior
- [ ] T009F Enforce symlink-safe deletion (do not follow symlinks)
- [ ] T009G Append-only audit log with required fields for every mutation
- [ ] T009H Enforce whitelist protection for directory-level actions
- [ ] T009I Implement list-file edit preview + atomic write (exact-match default)
**Checkpoint**: Safety guarantees verified - destructive workflows can now begin
---
## Phase 3: User Story 1 - [Title] (Priority: P1) 🎯 MVP
**Goal**: [Brief description of what this story delivers]

31
AGENTS.md Normal file

@@ -0,0 +1,31 @@
# gallery-organizer-web Development Guidelines
Auto-generated from all feature plans. Last updated: 2026-02-07
## Active Technologies
- Local durable state store (SQLite) + append-only audit log file (001-archive-curator)
- Rust (stable toolchain) + Web API framework (Axum), UI framework (SvelteKit), OpenAPI tooling (001-archive-curator)
## Project Structure
```text
src/
tests/
```
## Commands
cargo test
cargo clippy
## Code Style
Rust (stable toolchain): Follow standard conventions
## Recent Changes
- 001-archive-curator: Added Rust (stable toolchain) + Web API framework (Axum), UI framework (SvelteKit), OpenAPI tooling
<!-- MANUAL ADDITIONS START -->
<!-- MANUAL ADDITIONS END -->

17
README.md Normal file

@@ -0,0 +1,17 @@
# Archive Curator
## Local Run Notes
- Backend: Rust (Axum) service in `backend/`
- Frontend: Svelte-based UI in `frontend/`
### Planned Commands
- Backend tests: `cargo test`
- Backend lint: `cargo clippy`
- Frontend scripts: `npm run dev` / `npm run build`
### Safety Defaults
This project is designed for local-only operation with strict safety gates:
read-only mode, preview/confirm workflows, and append-only audit logging.

18
backend/Cargo.toml Normal file

@@ -0,0 +1,18 @@
[package]
name = "archive-curator-backend"
version = "0.1.0"
edition = "2021"
[dependencies]
axum = "0.7"
chrono = { version = "0.4", features = ["serde"] }
serde = { version = "1", features = ["derive"] }
serde_json = "1"
sqlx = { version = "0.7", features = ["runtime-tokio", "sqlite", "macros"] }
tokio = { version = "1", features = ["macros", "rt-multi-thread"] }
tracing = "0.1"
tracing-subscriber = { version = "0.3", features = ["env-filter", "fmt"] }
thiserror = "1"
anyhow = "1"
uuid = { version = "1", features = ["v4", "serde"] }
rand = "0.8"

1
backend/rustfmt.toml Normal file

@@ -0,0 +1 @@
edition = "2021"

2
backend/src/api/mod.rs Normal file

@@ -0,0 +1,2 @@
pub mod untagged;
pub mod untagged_delete;

145
backend/src/api/untagged.rs Normal file

@@ -0,0 +1,145 @@
use std::fs;
use std::path::Path;
use axum::{
extract::{Path as AxumPath, State},
http::StatusCode,
response::IntoResponse,
routing::{get, post},
Json, Router,
};
use serde::{Deserialize, Serialize};
use crate::services::collage_sampler::MediaItem;
use crate::state::AppState;
#[derive(Debug, Serialize, Deserialize)]
pub struct UntaggedCollage {
pub directory_id: String,
pub directory_name: String,
pub total_size_bytes: u64,
pub file_count: u64,
pub samples: Vec<MediaItem>,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct DecisionResult {
pub outcome: String,
pub audit_entry_id: String,
}
pub fn router(state: AppState) -> Router {
Router::new()
.route("/directories/untagged/next", get(next_untagged))
.route(
"/directories/untagged/:directory_id/resample",
post(resample_collage),
)
.route(
"/directories/untagged/:directory_id/keep",
post(keep_directory),
)
.with_state(state)
}
async fn next_untagged(State(state): State<AppState>) -> Result<Json<UntaggedCollage>, StatusCode> {
let directory = state
.untagged_queue
.next_directory()
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?
.ok_or(StatusCode::NOT_FOUND)?;
let samples = state
.collage_sampler
.sample(&directory.id, &directory.absolute_path, 12)
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok(Json(UntaggedCollage {
directory_id: directory.id,
directory_name: directory.name,
total_size_bytes: directory.total_size_bytes,
file_count: directory.file_count,
samples,
}))
}
async fn resample_collage(
State(state): State<AppState>,
AxumPath(directory_id): AxumPath<String>,
) -> Result<Json<UntaggedCollage>, StatusCode> {
let directory_path = state
.untagged_queue
.resolve_directory(&directory_id)
.map_err(|_| StatusCode::BAD_REQUEST)?;
let directory_name = directory_path
.file_name()
.and_then(|n| n.to_str())
.unwrap_or_default()
.to_string();
let (total_size_bytes, file_count) = dir_stats(&directory_path)
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
let samples = state
.collage_sampler
.sample(&directory_id, &directory_path, 12)
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok(Json(UntaggedCollage {
directory_id,
directory_name,
total_size_bytes,
file_count,
samples,
}))
}
async fn keep_directory(
State(state): State<AppState>,
AxumPath(directory_id): AxumPath<String>,
) -> Result<impl IntoResponse, StatusCode> {
state
.read_only
.ensure_writable()
.map_err(|_| StatusCode::CONFLICT)?;
let directory_path = state
.untagged_queue
.resolve_directory(&directory_id)
.map_err(|_| StatusCode::BAD_REQUEST)?;
let destination = state
.ops
.keep_directory(&directory_path)
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
let entry = state
.audit_log
.append_mutation(
"keep_directory",
vec![directory_path.display().to_string(), destination.display().to_string()],
Vec::new(),
"ok",
None,
)
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
Ok(Json(DecisionResult {
outcome: "kept".to_string(),
audit_entry_id: entry.id.to_string(),
}))
}
fn dir_stats(path: &Path) -> std::io::Result<(u64, u64)> {
let mut total_size = 0u64;
let mut file_count = 0u64;
for entry in fs::read_dir(path)? {
let entry = entry?;
let meta = entry.metadata()?;
if meta.is_dir() {
let (size, count) = dir_stats(&entry.path())?;
total_size += size;
file_count += count;
} else if meta.is_file() {
total_size += meta.len();
file_count += 1;
}
}
Ok((total_size, file_count))
}

backend/src/api/untagged_delete.rs Normal file

@@ -0,0 +1,153 @@
use axum::{
extract::{Path as AxumPath, State},
http::StatusCode,
routing::post,
Json, Router,
};
use serde::{Deserialize, Serialize};
use uuid::Uuid;
use crate::services::list_file::{apply_removals_atomic, load_entries, match_entries, preview_removals};
use crate::services::preview_action::{PreviewAction, PreviewActionType};
use crate::state::AppState;
#[derive(Debug, Serialize, Deserialize)]
pub struct DeletePreview {
pub preview_id: String,
pub target_paths: Vec<String>,
pub list_file_changes_preview: Vec<String>,
pub can_proceed: bool,
pub read_only_mode: bool,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct DeleteConfirm {
pub preview_id: String,
pub remove_from_list_file: bool,
#[serde(default)]
pub selected_matches: Option<Vec<String>>,
}
#[derive(Debug, Serialize, Deserialize)]
pub struct DecisionResult {
pub outcome: String,
pub audit_entry_id: String,
}
pub fn router(state: AppState) -> Router {
Router::new()
.route(
"/directories/untagged/:directory_id/preview-delete",
post(preview_delete),
)
.route(
"/directories/untagged/:directory_id/confirm-delete",
post(confirm_delete),
)
.with_state(state)
}
async fn preview_delete(
State(state): State<AppState>,
AxumPath(directory_id): AxumPath<String>,
) -> Result<Json<DeletePreview>, StatusCode> {
state
.read_only
.ensure_writable()
.map_err(|_| StatusCode::CONFLICT)?;
let directory_path = state
.untagged_queue
.resolve_directory(&directory_id)
.map_err(|_| StatusCode::BAD_REQUEST)?;
let directory_name = directory_path
.file_name()
.and_then(|n| n.to_str())
.unwrap_or_default()
.to_string();
let mut entries = load_entries(&state.config.download_list_path)
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
let _ = match_entries(&mut entries, &[directory_name]);
let (_remaining, removed) = preview_removals(&entries);
let action = PreviewAction::new(
PreviewActionType::DirectoryDelete,
vec![directory_path.display().to_string()],
removed.clone(),
);
let action = state.preview_store.create(action);
Ok(Json(DeletePreview {
preview_id: action.id.to_string(),
target_paths: action.target_paths,
list_file_changes_preview: action.list_file_changes_preview,
can_proceed: true,
read_only_mode: false,
}))
}
async fn confirm_delete(
State(state): State<AppState>,
AxumPath(directory_id): AxumPath<String>,
Json(payload): Json<DeleteConfirm>,
) -> Result<Json<DecisionResult>, StatusCode> {
state
.read_only
.ensure_writable()
.map_err(|_| StatusCode::CONFLICT)?;
let preview_id = Uuid::parse_str(&payload.preview_id)
.map_err(|_| StatusCode::BAD_REQUEST)?;
let _preview = state
.preview_store
.confirm(preview_id)
.map_err(|_| StatusCode::BAD_REQUEST)?;
let directory_path = state
.untagged_queue
.resolve_directory(&directory_id)
.map_err(|_| StatusCode::BAD_REQUEST)?;
let mut list_file_changes = Vec::new();
if payload.remove_from_list_file {
let selected = payload
.selected_matches
.unwrap_or_else(|| {
directory_path
.file_name()
.and_then(|n| n.to_str())
.map(|s| vec![s.to_string()])
.unwrap_or_default()
});
let mut entries = load_entries(&state.config.download_list_path)
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
let _ = match_entries(&mut entries, &selected);
let (remaining, removed) = preview_removals(&entries);
apply_removals_atomic(&state.config.download_list_path, &remaining)
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
list_file_changes = removed;
}
let staged = state
.ops
.confirm_delete_directory(&directory_path, false, true)
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
let entry = state
.audit_log
.append_mutation(
"delete_directory",
vec![directory_path.display().to_string()],
list_file_changes.clone(),
"ok",
Some(preview_id),
)
.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
let outcome = if staged.is_some() { "staged" } else { "deleted" };
Ok(Json(DecisionResult {
outcome: outcome.to_string(),
audit_entry_id: entry.id.to_string(),
}))
}

126
backend/src/config.rs Normal file

@@ -0,0 +1,126 @@
use std::path::{Component, Path, PathBuf};
use serde::{Deserialize, Serialize};
use crate::error::{AppError, AppResult};
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct Config {
pub untagged_root: PathBuf,
pub whitelisted_root: PathBuf,
pub kept_root: PathBuf,
pub trash_root: PathBuf,
pub download_list_path: PathBuf,
pub audit_log_path: PathBuf,
pub state_db_path: PathBuf,
pub read_only_mode: bool,
pub hard_delete_enabled: bool,
pub excluded_patterns: Vec<String>,
}
impl Config {
pub fn from_env() -> AppResult<Self> {
let untagged_root = env_path("UNTAGGED_ROOT")?;
let whitelisted_root = env_path("WHITELISTED_ROOT")?;
let kept_root = env_path("KEPT_ROOT")?;
let trash_root = env_path("TRASH_ROOT")?;
let download_list_path = env_path("DOWNLOAD_LIST_PATH")?;
let audit_log_path = env_path("AUDIT_LOG_PATH")?;
let state_db_path = env_path("STATE_DB_PATH")?;
let read_only_mode = env_bool("READ_ONLY_MODE")?;
let hard_delete_enabled = env_bool("HARD_DELETE_ENABLED")?;
let excluded_patterns = std::env::var("EXCLUDED_PATTERNS")
.ok()
.map(|v| {
v.split(',')
.map(|s| s.trim().to_string())
.filter(|s| !s.is_empty())
.collect()
})
.unwrap_or_default();
let config = Self {
untagged_root,
whitelisted_root,
kept_root,
trash_root,
download_list_path,
audit_log_path,
state_db_path,
read_only_mode,
hard_delete_enabled,
excluded_patterns,
};
config.validate()?;
Ok(config)
}
pub fn validate(&self) -> AppResult<()> {
let roots = [
("untagged_root", &self.untagged_root),
("whitelisted_root", &self.whitelisted_root),
("kept_root", &self.kept_root),
("trash_root", &self.trash_root),
];
for (name, root) in roots.iter() {
if !root.is_absolute() {
return Err(AppError::InvalidConfig(format!(
"{name} must be an absolute path"
)));
}
}
validate_non_overlapping_roots(&roots)?;
Ok(())
}
}
pub fn validate_non_overlapping_roots(roots: &[(&str, &PathBuf)]) -> AppResult<()> {
let mut normalized = Vec::with_capacity(roots.len());
for (name, root) in roots.iter() {
let cleaned = normalize_path(root);
normalized.push(((*name).to_string(), cleaned));
}
for i in 0..normalized.len() {
for j in (i + 1)..normalized.len() {
let (name_a, path_a) = &normalized[i];
let (name_b, path_b) = &normalized[j];
if path_a == path_b {
return Err(AppError::InvalidConfig(format!(
"{name_a} and {name_b} must be different"
)));
}
if path_a.starts_with(path_b) || path_b.starts_with(path_a) {
return Err(AppError::InvalidConfig(format!(
"{name_a} and {name_b} must not overlap"
)));
}
}
}
Ok(())
}
fn normalize_path(path: &Path) -> PathBuf {
let mut out = PathBuf::new();
for component in path.components() {
match component {
Component::CurDir => {}
Component::ParentDir => {
out.pop();
}
Component::RootDir | Component::Prefix(_) => out.push(component.as_os_str()),
Component::Normal(_) => out.push(component.as_os_str()),
}
}
out
}
fn env_path(key: &str) -> AppResult<PathBuf> {
let value = std::env::var(key)
.map_err(|_| AppError::InvalidConfig(format!("{key} is required")))?;
Ok(PathBuf::from(value))
}
fn env_bool(key: &str) -> AppResult<bool> {
let value = std::env::var(key)
.map_err(|_| AppError::InvalidConfig(format!("{key} is required")))?;
Ok(matches!(value.as_str(), "1" | "true" | "TRUE" | "yes" | "YES"))
}
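A minimal test sketch for the overlap check above — not part of this commit; it assumes placement in the same module as `validate_non_overlapping_roots` (e.g. appended to `backend/src/config.rs`), and the `/data/...` paths are illustrative only:

```rust
// Sketch only: exercises validate_non_overlapping_roots with hypothetical paths.
#[cfg(test)]
mod overlap_check_sketch {
    use super::*;
    use std::path::PathBuf;

    #[test]
    fn rejects_nested_roots() {
        let archive = PathBuf::from("/data/archive");
        let trash = PathBuf::from("/data/archive/trash");
        let roots = [("untagged_root", &archive), ("trash_root", &trash)];
        // trash_root lives inside untagged_root, so the configuration is refused.
        assert!(validate_non_overlapping_roots(&roots).is_err());
    }

    #[test]
    fn accepts_disjoint_roots() {
        let untagged = PathBuf::from("/data/untagged");
        let trash = PathBuf::from("/data/trash");
        let roots = [("untagged_root", &untagged), ("trash_root", &trash)];
        assert!(validate_non_overlapping_roots(&roots).is_ok());
    }
}
```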

21
backend/src/error.rs Normal file

@@ -0,0 +1,21 @@
use thiserror::Error;
#[derive(Debug, Error)]
pub enum AppError {
#[error("invalid configuration: {0}")]
InvalidConfig(String),
#[error("read-only mode enabled")]
ReadOnly,
#[error("path outside configured roots: {0}")]
PathViolation(String),
#[error("whitelisted directory protected: {0}")]
WhitelistProtected(String),
#[error("io error: {0}")]
Io(#[from] std::io::Error),
#[error("serde json error: {0}")]
SerdeJson(#[from] serde_json::Error),
#[error("sqlx error: {0}")]
Sqlx(#[from] sqlx::Error),
}
pub type AppResult<T> = Result<T, AppError>;

50
backend/src/main.rs Normal file

@@ -0,0 +1,50 @@
mod api;
mod config;
mod error;
mod services;
mod state;
use std::net::{IpAddr, SocketAddr};
use axum::{routing::get, Router};
use tracing_subscriber::{layer::SubscriberExt, util::SubscriberInitExt};
use crate::config::Config;
use crate::state::AppState;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
tracing_subscriber::registry()
.with(tracing_subscriber::EnvFilter::from_default_env())
.with(tracing_subscriber::fmt::layer())
.init();
let bind_addr = std::env::var("BIND_ADDR").unwrap_or_else(|_| "127.0.0.1:8080".to_string());
let socket_addr: SocketAddr = bind_addr.parse()?;
if !is_local_network(socket_addr.ip()) {
return Err("bind address must be loopback or private network".into());
}
let config = Config::from_env()?;
let state = AppState::new(config)?;
let app = Router::new()
.route("/health", get(|| async { "OK" }))
.merge(api::untagged::router(state.clone()))
.merge(api::untagged_delete::router(state.clone()));
tracing::info!("listening on {}", socket_addr);
let listener = tokio::net::TcpListener::bind(socket_addr).await?;
axum::serve(listener, app).await?;
Ok(())
}
fn is_local_network(ip: IpAddr) -> bool {
match ip {
IpAddr::V4(v4) => v4.is_loopback()
|| v4.is_private()
|| v4.is_link_local()
|| v4.is_shared(),
IpAddr::V6(v6) => v6.is_loopback() || v6.is_unique_local() || v6.is_unicast_link_local(),
}
}

backend/src/services/audit_log.rs Normal file

@@ -0,0 +1,62 @@
use std::fs::OpenOptions;
use std::io::Write;
use std::path::PathBuf;
use chrono::{DateTime, Utc};
use serde::{Deserialize, Serialize};
use uuid::Uuid;
use crate::error::AppResult;
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct AuditEntry {
pub id: Uuid,
pub timestamp: DateTime<Utc>,
pub action_type: String,
pub affected_paths: Vec<String>,
pub list_file_changes: Vec<String>,
pub outcome: String,
pub preview_id: Option<Uuid>,
}
#[derive(Clone)]
pub struct AuditLog {
path: PathBuf,
}
impl AuditLog {
pub fn new(path: PathBuf) -> Self {
Self { path }
}
pub fn append(&self, entry: &AuditEntry) -> AppResult<()> {
let mut file = OpenOptions::new()
.create(true)
.append(true)
.open(&self.path)?;
let line = serde_json::to_string(entry)?;
writeln!(file, "{line}")?;
Ok(())
}
pub fn append_mutation(
&self,
action_type: &str,
affected_paths: Vec<String>,
list_file_changes: Vec<String>,
outcome: &str,
preview_id: Option<Uuid>,
) -> AppResult<AuditEntry> {
let entry = AuditEntry {
id: Uuid::new_v4(),
timestamp: Utc::now(),
action_type: action_type.to_string(),
affected_paths,
list_file_changes,
outcome: outcome.to_string(),
preview_id,
};
self.append(&entry)?;
Ok(entry)
}
}
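An illustrative usage sketch, not in the commit: each mutation becomes one JSON line appended to the audit file. It assumes a test module inside `backend/src/services/audit_log.rs`; the temp-dir path and entry values are examples only:

```rust
// Sketch only: appends one mutation and parses it back from the JSON-lines file.
#[cfg(test)]
mod audit_log_sketch {
    use super::*;

    #[test]
    fn appends_one_json_line_per_mutation() -> AppResult<()> {
        let path = std::env::temp_dir().join("audit-log-sketch.jsonl");
        let _ = std::fs::remove_file(&path);

        let log = AuditLog::new(path.clone());
        let entry = log.append_mutation(
            "delete_directory",
            vec!["/data/untagged/example".to_string()],
            vec!["example".to_string()],
            "ok",
            None,
        )?;

        // The file is plain JSON lines: one durable record per mutation.
        let contents = std::fs::read_to_string(&path)?;
        let parsed: AuditEntry = serde_json::from_str(contents.lines().next().unwrap())?;
        assert_eq!(parsed.id, entry.id);
        assert_eq!(parsed.action_type, "delete_directory");
        Ok(())
    }
}
```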

backend/src/services/collage_sampler.rs Normal file

@@ -0,0 +1,83 @@
use std::fs;
use std::path::{Path, PathBuf};
use rand::seq::SliceRandom;
use rand::thread_rng;
use serde::{Deserialize, Serialize};
use uuid::Uuid;
use crate::error::AppResult;
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct MediaItem {
pub id: String,
pub user_directory_id: String,
pub relative_path: String,
pub size_bytes: u64,
pub media_type: String,
}
#[derive(Clone, Default)]
pub struct CollageSampler;
impl CollageSampler {
pub fn sample(&self, directory_id: &str, directory: &Path, count: usize) -> AppResult<Vec<MediaItem>> {
let mut files = Vec::new();
collect_media_files(directory, &mut files)?;
let mut rng = thread_rng();
files.shuffle(&mut rng);
let samples = files.into_iter().take(count).map(|path| {
let relative_path = path
.strip_prefix(directory)
.unwrap_or(&path)
.to_string_lossy()
.to_string();
let size_bytes = fs::metadata(&path).map(|m| m.len()).unwrap_or(0);
let media_type = media_type_for(&path);
MediaItem {
id: Uuid::new_v4().to_string(),
user_directory_id: directory_id.to_string(),
relative_path,
size_bytes,
media_type,
}
});
Ok(samples.collect())
}
}
fn collect_media_files(dir: &Path, out: &mut Vec<PathBuf>) -> AppResult<()> {
for entry in fs::read_dir(dir)? {
let entry = entry?;
let path = entry.path();
let meta = entry.metadata()?;
if meta.is_dir() {
collect_media_files(&path, out)?;
} else if meta.is_file() && is_media_file(&path) {
out.push(path);
}
}
Ok(())
}
fn is_media_file(path: &Path) -> bool {
match path.extension().and_then(|e| e.to_str()).map(|e| e.to_lowercase()) {
Some(ext) => matches!(
ext.as_str(),
"jpg" | "jpeg" | "png" | "gif" | "webp" | "bmp" | "mp4" | "webm" | "mkv" | "mov" | "avi"
),
None => false,
}
}
fn media_type_for(path: &Path) -> String {
match path.extension().and_then(|e| e.to_str()).map(|e| e.to_lowercase()) {
Some(ext) if matches!(ext.as_str(), "jpg" | "jpeg" | "png" | "gif" | "webp" | "bmp") => {
"image".to_string()
}
Some(ext) if matches!(ext.as_str(), "mp4" | "webm" | "mkv" | "mov" | "avi") => {
"video".to_string()
}
_ => "other".to_string(),
}
}
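A small sketch of the extension-based classification above, assuming it lives in the same module so the private helpers are reachable:

```rust
// Sketch only: the private helpers classify files purely by (case-insensitive) extension.
#[cfg(test)]
mod media_type_sketch {
    use super::*;

    #[test]
    fn extensions_map_to_coarse_media_types() {
        assert!(is_media_file(Path::new("cat.JPG")));
        assert_eq!(media_type_for(Path::new("cat.JPG")), "image");
        assert_eq!(media_type_for(Path::new("clip.webm")), "video");
        assert!(!is_media_file(Path::new("notes.txt")));
        assert_eq!(media_type_for(Path::new("notes.txt")), "other");
    }
}
```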

backend/src/services/list_file.rs Normal file

@@ -0,0 +1,85 @@
use std::fs::{self, File};
use std::io::{BufRead, BufReader, Write};
use std::path::{Path, PathBuf};
use serde::{Deserialize, Serialize};
use crate::error::{AppError, AppResult};
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct DownloadListEntry {
pub raw_line: String,
pub normalized_value: String,
pub matched: bool,
}
pub fn load_entries(path: &Path) -> AppResult<Vec<DownloadListEntry>> {
let file = File::open(path)?;
let reader = BufReader::new(file);
let mut entries = Vec::new();
for line in reader.lines() {
let raw = line?;
let normalized = normalize_value(&raw);
if normalized.is_empty() {
continue;
}
entries.push(DownloadListEntry {
raw_line: raw,
normalized_value: normalized,
matched: false,
});
}
Ok(entries)
}
pub fn match_entries(entries: &mut [DownloadListEntry], targets: &[String]) -> Vec<DownloadListEntry> {
let normalized_targets: Vec<String> = targets.iter().map(|t| normalize_value(t)).collect();
let mut matched = Vec::new();
for entry in entries.iter_mut() {
if normalized_targets.iter().any(|t| t == &entry.normalized_value) {
entry.matched = true;
matched.push(entry.clone());
}
}
matched
}
pub fn preview_removals(entries: &[DownloadListEntry]) -> (Vec<String>, Vec<String>) {
let mut remaining = Vec::new();
let mut removed = Vec::new();
for entry in entries {
if entry.matched {
removed.push(entry.raw_line.clone());
} else {
remaining.push(entry.raw_line.clone());
}
}
(remaining, removed)
}
pub fn apply_removals_atomic(path: &Path, remaining_lines: &[String]) -> AppResult<()> {
let temp_path = temp_path_for(path)?;
{
let mut file = File::create(&temp_path)?;
for line in remaining_lines {
writeln!(file, "{line}")?;
}
}
fs::rename(temp_path, path)?;
Ok(())
}
pub fn normalize_value(value: &str) -> String {
value.trim().to_lowercase()
}
fn temp_path_for(path: &Path) -> AppResult<PathBuf> {
let parent = path
.parent()
.ok_or_else(|| AppError::InvalidConfig("list file has no parent".to_string()))?;
let file_name = path
.file_name()
.and_then(|n| n.to_str())
.ok_or_else(|| AppError::InvalidConfig("list file has invalid name".to_string()))?;
Ok(parent.join(format!("{file_name}.tmp")))
}
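A sketch of the exact-match removal flow the constitution requires (load → match → preview → atomic apply), assuming it runs as a test inside `backend/src/services/list_file.rs`; the temp-dir list file and its entries are invented for the example:

```rust
// Sketch only: load -> exact match (trimmed, lowercased) -> preview -> atomic apply.
#[cfg(test)]
mod list_file_flow_sketch {
    use super::*;

    #[test]
    fn exact_match_preview_then_atomic_apply() -> AppResult<()> {
        let path = std::env::temp_dir().join("download-list-sketch.txt");
        std::fs::write(&path, "alice\nBob\ncarol\n")?;

        let mut entries = load_entries(&path)?;
        // "bob" matches the raw line "Bob" because matching compares normalized values.
        let matched = match_entries(&mut entries, &["bob".to_string()]);
        assert_eq!(matched.len(), 1);

        let (remaining, removed) = preview_removals(&entries);
        assert_eq!(removed, vec!["Bob".to_string()]);

        // Remaining lines are written to a temp file and swapped in via rename.
        apply_removals_atomic(&path, &remaining)?;
        assert_eq!(std::fs::read_to_string(&path)?, "alice\ncarol\n");
        Ok(())
    }
}
```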

backend/src/services/mod.rs Normal file

@@ -0,0 +1,10 @@
pub mod audit_log;
pub mod collage_sampler;
pub mod list_file;
pub mod ops;
pub mod ops_lock;
pub mod path_guard;
pub mod preview_action;
pub mod read_only;
pub mod state_store;
pub mod untagged_queue;

130
backend/src/services/ops.rs Normal file

@@ -0,0 +1,130 @@
use std::fs;
use std::path::{Path, PathBuf};
use uuid::Uuid;
use crate::config::Config;
use crate::error::{AppError, AppResult};
use crate::services::path_guard::PathGuard;
#[derive(Clone)]
pub struct Ops {
path_guard: PathGuard,
whitelisted_root: PathBuf,
kept_root: PathBuf,
trash_root: PathBuf,
hard_delete_enabled: bool,
}
impl Ops {
pub fn from_config(config: &Config, path_guard: PathGuard) -> AppResult<Self> {
config.validate()?;
Ok(Self {
path_guard,
whitelisted_root: config.whitelisted_root.clone(),
kept_root: config.kept_root.clone(),
trash_root: config.trash_root.clone(),
hard_delete_enabled: config.hard_delete_enabled,
})
}
pub fn move_dir(&self, from: &Path, to: &Path) -> AppResult<()> {
self.path_guard.ensure_within_roots(from)?;
self.path_guard.ensure_within_roots(to)?;
self.ensure_not_symlink(from)?;
fs::rename(from, to)?;
Ok(())
}
pub fn keep_directory(&self, path: &Path) -> AppResult<PathBuf> {
self.path_guard.ensure_within_roots(path)?;
self.ensure_not_symlink(path)?;
let name = path
.file_name()
.and_then(|n| n.to_str())
.ok_or_else(|| AppError::InvalidConfig("invalid path".to_string()))?;
let destination = self.kept_root.join(name);
self.move_dir(path, &destination)?;
Ok(destination)
}
pub fn stage_delete_dir(&self, path: &Path) -> AppResult<PathBuf> {
self.path_guard.ensure_within_roots(path)?;
self.ensure_not_whitelisted(path)?;
self.ensure_not_symlink(path)?;
let staged_path = self.staged_path(path)?;
fs::rename(path, &staged_path)?;
Ok(staged_path)
}
pub fn stage_delete_file(&self, path: &Path) -> AppResult<PathBuf> {
self.path_guard.ensure_within_roots(path)?;
self.ensure_not_symlink(path)?;
let staged_path = self.staged_path(path)?;
fs::rename(path, &staged_path)?;
Ok(staged_path)
}
pub fn hard_delete_dir(&self, path: &Path, confirmed: bool) -> AppResult<()> {
self.path_guard.ensure_within_roots(path)?;
self.ensure_not_whitelisted(path)?;
self.ensure_not_symlink(path)?;
self.ensure_hard_delete_allowed(confirmed)?;
fs::remove_dir_all(path)?;
Ok(())
}
pub fn hard_delete_file(&self, path: &Path, confirmed: bool) -> AppResult<()> {
self.path_guard.ensure_within_roots(path)?;
self.ensure_not_symlink(path)?;
self.ensure_hard_delete_allowed(confirmed)?;
fs::remove_file(path)?;
Ok(())
}
pub fn confirm_delete_directory(&self, path: &Path, hard_delete: bool, confirmed: bool) -> AppResult<Option<PathBuf>> {
if hard_delete {
self.hard_delete_dir(path, confirmed)?;
Ok(None)
} else {
let staged = self.stage_delete_dir(path)?;
Ok(Some(staged))
}
}
fn ensure_not_whitelisted(&self, path: &Path) -> AppResult<()> {
if path.starts_with(&self.whitelisted_root) {
return Err(AppError::WhitelistProtected(path.display().to_string()));
}
Ok(())
}
fn ensure_hard_delete_allowed(&self, confirmed: bool) -> AppResult<()> {
if !self.hard_delete_enabled || !confirmed {
return Err(AppError::InvalidConfig(
"hard delete disabled or unconfirmed".to_string(),
));
}
Ok(())
}
fn ensure_not_symlink(&self, path: &Path) -> AppResult<()> {
let metadata = fs::symlink_metadata(path)?;
if metadata.file_type().is_symlink() {
return Err(AppError::PathViolation(format!(
"symlink not allowed: {}",
path.display()
)));
}
Ok(())
}
fn staged_path(&self, path: &Path) -> AppResult<PathBuf> {
let name = path
.file_name()
.and_then(|n| n.to_str())
.ok_or_else(|| AppError::InvalidConfig("invalid path".to_string()))?;
let suffix = Uuid::new_v4();
Ok(self.trash_root.join(format!("{name}.{suffix}.staged")))
}
}
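A sketch of the whitelist guarantee in `Ops`, not part of this commit: the `Config` paths are illustrative, and the test assumes it sits next to `Ops` so items resolve via `use super::*`. Because the whitelist check fires before any filesystem call, no real directories are needed:

```rust
// Sketch only: the whitelist check rejects a staged delete before touching the filesystem.
#[cfg(test)]
mod whitelist_guard_sketch {
    use super::*;

    fn example_config() -> Config {
        Config {
            untagged_root: PathBuf::from("/data/untagged"),
            whitelisted_root: PathBuf::from("/data/whitelisted"),
            kept_root: PathBuf::from("/data/kept"),
            trash_root: PathBuf::from("/data/trash"),
            download_list_path: PathBuf::from("/data/download-list.txt"),
            audit_log_path: PathBuf::from("/data/audit.jsonl"),
            state_db_path: PathBuf::from("/data/state.db"),
            read_only_mode: false,
            hard_delete_enabled: false,
            excluded_patterns: Vec::new(),
        }
    }

    #[test]
    fn whitelisted_directories_cannot_be_staged_for_deletion() {
        let config = example_config();
        let guard = PathGuard::from_config(&config).expect("valid config");
        let ops = Ops::from_config(&config, guard).expect("valid config");
        let err = ops
            .stage_delete_dir(Path::new("/data/whitelisted/alice"))
            .unwrap_err();
        assert!(matches!(err, AppError::WhitelistProtected(_)));
    }
}
```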

backend/src/services/ops_lock.rs Normal file

@@ -0,0 +1,18 @@
use std::sync::Arc;
use tokio::sync::{Mutex, MutexGuard};
#[derive(Clone, Default)]
pub struct OpsLock {
inner: Arc<Mutex<()>>,
}
impl OpsLock {
pub fn new() -> Self {
Self::default()
}
pub async fn acquire(&self) -> MutexGuard<'_, ()> {
self.inner.lock().await
}
}
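Possible usage sketch, assuming a `#[tokio::test]` next to `OpsLock`: while one destructive operation holds the guard, a second one cannot start.

```rust
// Sketch only: a held guard keeps destructive operations single-writer.
#[cfg(test)]
mod ops_lock_sketch {
    use super::*;

    #[tokio::test]
    async fn destructive_operations_are_single_writer() {
        let lock = OpsLock::new();
        let _guard = lock.acquire().await;
        // While the guard is held, the underlying mutex cannot be taken again.
        assert!(lock.inner.try_lock().is_err());
    }
}
```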

backend/src/services/path_guard.rs Normal file

@@ -0,0 +1,48 @@
use std::path::{Path, PathBuf};
use crate::config::Config;
use crate::error::{AppError, AppResult};
#[derive(Debug, Clone)]
pub struct PathGuard {
roots: Vec<PathBuf>,
}
impl PathGuard {
pub fn from_config(config: &Config) -> AppResult<Self> {
config.validate()?;
Ok(Self {
roots: vec![
config.untagged_root.clone(),
config.whitelisted_root.clone(),
config.kept_root.clone(),
config.trash_root.clone(),
],
})
}
pub fn ensure_within_roots(&self, path: &Path) -> AppResult<()> {
let normalized = normalize(path);
for root in &self.roots {
let root_norm = normalize(root);
if normalized.starts_with(&root_norm) {
return Ok(());
}
}
Err(AppError::PathViolation(path.display().to_string()))
}
}
fn normalize(path: &Path) -> PathBuf {
let mut out = PathBuf::new();
for component in path.components() {
match component {
std::path::Component::CurDir => {}
std::path::Component::ParentDir => {
out.pop();
}
_ => out.push(component.as_os_str()),
}
}
out
}
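A sketch, not in the commit, showing why `normalize` runs before the root check: a `..` traversal resolves to a path outside the configured roots and is refused. It assumes placement inside `backend/src/services/path_guard.rs` so the private `roots` field is reachable; the paths are illustrative:

```rust
// Sketch only: `..` components are normalized away, so escaping a root is rejected.
#[cfg(test)]
mod path_guard_sketch {
    use super::*;

    #[test]
    fn rejects_paths_that_escape_the_configured_roots() {
        let guard = PathGuard {
            roots: vec![PathBuf::from("/data/untagged")],
        };
        assert!(guard
            .ensure_within_roots(Path::new("/data/untagged/user1"))
            .is_ok());
        // "/data/untagged/../secrets" normalizes to "/data/secrets", outside every root.
        assert!(guard
            .ensure_within_roots(Path::new("/data/untagged/../secrets"))
            .is_err());
    }
}
```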

backend/src/services/preview_action.rs Normal file

@@ -0,0 +1,83 @@
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use chrono::{DateTime, Duration, Utc};
use serde::{Deserialize, Serialize};
use uuid::Uuid;
use crate::error::{AppError, AppResult};
#[derive(Debug, Clone, Serialize, Deserialize)]
pub enum PreviewActionType {
DirectoryDelete,
FileDelete,
}
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct PreviewAction {
pub id: Uuid,
pub action_type: PreviewActionType,
pub target_paths: Vec<String>,
pub list_file_changes_preview: Vec<String>,
pub created_at: DateTime<Utc>,
pub expires_at: DateTime<Utc>,
}
impl PreviewAction {
pub fn new(action_type: PreviewActionType, target_paths: Vec<String>, list_file_changes: Vec<String>) -> Self {
let created_at = Utc::now();
let expires_at = created_at + Duration::minutes(15);
Self {
id: Uuid::new_v4(),
action_type,
target_paths,
list_file_changes_preview: list_file_changes,
created_at,
expires_at,
}
}
pub fn is_expired(&self) -> bool {
Utc::now() > self.expires_at
}
}
#[derive(Clone, Default)]
pub struct PreviewActionStore {
inner: Arc<Mutex<HashMap<Uuid, PreviewAction>>>,
}
impl PreviewActionStore {
pub fn new() -> Self {
Self::default()
}
pub fn create(&self, action: PreviewAction) -> PreviewAction {
let mut guard = self.inner.lock().expect("preview store lock");
guard.insert(action.id, action.clone());
action
}
pub fn get(&self, id: Uuid) -> AppResult<PreviewAction> {
let guard = self.inner.lock().expect("preview store lock");
let action = guard
.get(&id)
.cloned()
.ok_or_else(|| AppError::InvalidConfig("preview action not found".to_string()))?;
if action.is_expired() {
return Err(AppError::InvalidConfig("preview action expired".to_string()));
}
Ok(action)
}
pub fn confirm(&self, id: Uuid) -> AppResult<PreviewAction> {
let mut guard = self.inner.lock().expect("preview store lock");
let action = guard
.remove(&id)
.ok_or_else(|| AppError::InvalidConfig("preview action not found".to_string()))?;
if action.is_expired() {
return Err(AppError::InvalidConfig("preview action expired".to_string()));
}
Ok(action)
}
}
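An illustrative sketch, assuming a test module alongside `PreviewActionStore`: confirmation consumes the preview, so a replayed confirmation is rejected. The target path is made up for the example:

```rust
// Sketch only: confirm() removes the preview from the store, so it cannot be replayed.
#[cfg(test)]
mod preview_store_sketch {
    use super::*;

    #[test]
    fn a_preview_can_be_confirmed_exactly_once() {
        let store = PreviewActionStore::new();
        let action = store.create(PreviewAction::new(
            PreviewActionType::DirectoryDelete,
            vec!["/data/untagged/example".to_string()],
            Vec::new(),
        ));
        assert!(store.confirm(action.id).is_ok());
        // The first confirm removed the entry, so a second confirm fails.
        assert!(store.confirm(action.id).is_err());
    }
}
```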

backend/src/services/read_only.rs Normal file

@@ -0,0 +1,27 @@
use crate::config::Config;
use crate::error::{AppError, AppResult};
#[derive(Debug, Clone)]
pub struct ReadOnlyGuard {
read_only: bool,
}
impl ReadOnlyGuard {
pub fn new(config: &Config) -> Self {
Self {
read_only: config.read_only_mode,
}
}
pub fn ensure_writable(&self) -> AppResult<()> {
if self.read_only {
Err(AppError::ReadOnly)
} else {
Ok(())
}
}
pub fn ensure_writable_for_operation(&self, _operation: &str) -> AppResult<()> {
self.ensure_writable()
}
}

backend/src/services/state_store.rs Normal file

@@ -0,0 +1,23 @@
use sqlx::sqlite::{SqliteConnectOptions, SqlitePool};
use std::str::FromStr;
use crate::config::Config;
use crate::error::AppResult;
#[derive(Clone)]
pub struct StateStore {
pool: SqlitePool,
}
impl StateStore {
pub async fn connect(config: &Config) -> AppResult<Self> {
let options = SqliteConnectOptions::from_str(&config.state_db_path.to_string_lossy())?
.create_if_missing(true);
let pool = SqlitePool::connect_with(options).await?;
Ok(Self { pool })
}
pub fn pool(&self) -> &SqlitePool {
&self.pool
}
}

backend/src/services/untagged_queue.rs Normal file

@@ -0,0 +1,97 @@
use std::fs;
use std::path::{Path, PathBuf};
use serde::{Deserialize, Serialize};
use crate::config::Config;
use crate::error::{AppError, AppResult};
#[derive(Debug, Clone, Serialize, Deserialize)]
pub struct UntaggedDirectory {
pub id: String,
pub name: String,
pub absolute_path: PathBuf,
pub total_size_bytes: u64,
pub file_count: u64,
}
#[derive(Clone)]
pub struct UntaggedQueue {
root: PathBuf,
}
impl UntaggedQueue {
pub fn new(config: &Config) -> AppResult<Self> {
config.validate()?;
Ok(Self {
root: config.untagged_root.clone(),
})
}
pub fn next_directory(&self) -> AppResult<Option<UntaggedDirectory>> {
let mut dirs: Vec<PathBuf> = fs::read_dir(&self.root)?
.filter_map(|entry| entry.ok())
.map(|entry| entry.path())
.filter(|path| path.is_dir())
.collect();
dirs.sort();
let path = match dirs.first() {
Some(path) => path.to_path_buf(),
None => return Ok(None),
};
let name = path
.file_name()
.and_then(|n| n.to_str())
.unwrap_or_default()
.to_string();
let id = relative_id(&self.root, &path)?;
let (total_size_bytes, file_count) = dir_stats(&path)?;
Ok(Some(UntaggedDirectory {
id,
name,
absolute_path: path,
total_size_bytes,
file_count,
}))
}
pub fn resolve_directory(&self, id: &str) -> AppResult<PathBuf> {
ensure_safe_id(id)?;
Ok(self.root.join(id))
}
}
fn dir_stats(path: &Path) -> AppResult<(u64, u64)> {
let mut total_size = 0u64;
let mut file_count = 0u64;
for entry in fs::read_dir(path)? {
let entry = entry?;
let meta = entry.metadata()?;
if meta.is_dir() {
let (size, count) = dir_stats(&entry.path())?;
total_size += size;
file_count += count;
} else if meta.is_file() {
total_size += meta.len();
file_count += 1;
}
}
Ok((total_size, file_count))
}
fn relative_id(root: &Path, path: &Path) -> AppResult<String> {
let rel = path
.strip_prefix(root)
.map_err(|_| AppError::InvalidConfig("path outside untagged root".to_string()))?;
Ok(rel.to_string_lossy().to_string())
}
fn ensure_safe_id(id: &str) -> AppResult<()> {
let path = Path::new(id);
for component in path.components() {
if matches!(component, std::path::Component::ParentDir) {
return Err(AppError::InvalidConfig("invalid directory id".to_string()));
}
}
Ok(())
}
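A short sketch of the id guard, assuming it sits in the same module as `ensure_safe_id`: ids containing `..` components are refused before being joined onto the untagged root.

```rust
// Sketch only: directory ids are validated before resolve_directory joins them to the root.
#[cfg(test)]
mod untagged_id_sketch {
    use super::*;

    #[test]
    fn directory_ids_cannot_traverse_upwards() {
        assert!(ensure_safe_id("alice").is_ok());
        assert!(ensure_safe_id("alice/2024").is_ok());
        assert!(ensure_safe_id("../etc").is_err());
    }
}
```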

42
backend/src/state.rs Normal file

@@ -0,0 +1,42 @@
use std::sync::Arc;
use crate::config::Config;
use crate::error::AppResult;
use crate::services::{
audit_log::AuditLog, collage_sampler::CollageSampler, ops::Ops, path_guard::PathGuard,
preview_action::PreviewActionStore, read_only::ReadOnlyGuard, untagged_queue::UntaggedQueue,
};
#[derive(Clone)]
pub struct AppState {
pub config: Arc<Config>,
pub path_guard: PathGuard,
pub ops: Ops,
pub read_only: ReadOnlyGuard,
pub audit_log: AuditLog,
pub preview_store: PreviewActionStore,
pub untagged_queue: UntaggedQueue,
pub collage_sampler: CollageSampler,
}
impl AppState {
pub fn new(config: Config) -> AppResult<Self> {
let path_guard = PathGuard::from_config(&config)?;
let ops = Ops::from_config(&config, path_guard.clone())?;
let read_only = ReadOnlyGuard::new(&config);
let audit_log = AuditLog::new(config.audit_log_path.clone());
let preview_store = PreviewActionStore::new();
let untagged_queue = UntaggedQueue::new(&config)?;
let collage_sampler = CollageSampler::default();
Ok(Self {
config: Arc::new(config),
path_guard,
ops,
read_only,
audit_log,
preview_store,
untagged_queue,
collage_sampler,
})
}
}

4
frontend/.prettierrc Normal file

@@ -0,0 +1,4 @@
{
"singleQuote": true,
"semi": true
}

21
frontend/package.json Normal file

@@ -0,0 +1,21 @@
{
"name": "archive-curator-frontend",
"version": "0.1.0",
"private": true,
"type": "module",
"scripts": {
"dev": "vite dev",
"build": "vite build",
"preview": "vite preview",
"lint": "eslint .",
"format": "prettier --write ."
},
"devDependencies": {
"@sveltejs/kit": "^2.0.0",
"@sveltejs/vite-plugin-svelte": "^3.0.0",
"eslint": "^9.0.0",
"prettier": "^3.0.0",
"svelte": "^5.0.0",
"vite": "^5.0.0"
}
}

frontend/src/components/list-file-matches.svelte Normal file

@@ -0,0 +1,55 @@
<script lang="ts">
export let matches: string[] = [];
export let selected: string[] = [];
function toggle(match: string) {
if (selected.includes(match)) {
selected = selected.filter((value) => value !== match);
} else {
selected = [...selected, match];
}
}
</script>
<div class="matches">
<h3>List-file matches</h3>
{#if matches.length === 0}
<p>No matches detected.</p>
{:else}
<ul>
{#each matches as match}
<li>
<label>
<input
type="checkbox"
checked={selected.includes(match)}
on:change={() => toggle(match)}
/>
<span>{match}</span>
</label>
</li>
{/each}
</ul>
{/if}
</div>
<style>
.matches {
border: 1px solid #c2b8a3;
padding: 1rem;
border-radius: 12px;
background: #f7f2e9;
}
ul {
list-style: none;
padding: 0;
margin: 0.5rem 0 0;
display: grid;
gap: 0.5rem;
}
label {
display: flex;
gap: 0.5rem;
align-items: center;
}
</style>

frontend/src/components/untagged-controls.svelte Normal file

@@ -0,0 +1,41 @@
<script lang="ts">
export let onResample: () => void;
export let onKeep: () => void;
export let onPreviewDelete: () => void;
export let onConfirmDelete: () => void;
export let confirmEnabled = false;
export let busy = false;
</script>
<div class="controls">
<button on:click={onResample} disabled={busy}>Resample</button>
<button on:click={onKeep} disabled={busy}>Keep</button>
<button on:click={onPreviewDelete} disabled={busy}>Preview delete</button>
<button class="danger" on:click={onConfirmDelete} disabled={!confirmEnabled || busy}>
Confirm delete
</button>
</div>
<style>
.controls {
display: flex;
flex-wrap: wrap;
gap: 0.75rem;
}
button {
padding: 0.6rem 1rem;
border-radius: 999px;
border: 1px solid #1e1e1e;
background: #f1d6b8;
font-weight: 600;
}
button.danger {
background: #d96c4f;
color: #fff;
border-color: #7f3422;
}
button:disabled {
opacity: 0.6;
cursor: not-allowed;
}
</style>

frontend/src/routes/+page.svelte Normal file

@@ -0,0 +1,210 @@
<script lang="ts">
import { onMount } from 'svelte';
import ListFileMatches from '../components/list-file-matches.svelte';
import UntaggedControls from '../components/untagged-controls.svelte';
import {
confirmDelete,
fetchNextUntagged,
previewDelete,
resampleCollage,
keepDirectory,
type DeletePreview,
type UntaggedCollage,
} from '../services/untagged_api';
let collage: UntaggedCollage | null = null;
let preview: DeletePreview | null = null;
let selectedMatches: string[] = [];
let statusMessage = '';
let busy = false;
async function loadNext() {
busy = true;
statusMessage = '';
preview = null;
selectedMatches = [];
try {
collage = await fetchNextUntagged();
} catch (err) {
statusMessage = 'No untagged directories available.';
} finally {
busy = false;
}
}
async function handleResample() {
if (!collage) return;
busy = true;
try {
collage = await resampleCollage(collage.directory_id);
} finally {
busy = false;
}
}
async function handleKeep() {
if (!collage) return;
busy = true;
statusMessage = '';
try {
await keepDirectory(collage.directory_id);
statusMessage = 'Directory moved to kept.';
await loadNext();
} catch (err) {
statusMessage = 'Keep failed.';
} finally {
busy = false;
}
}
async function handlePreviewDelete() {
if (!collage) return;
busy = true;
statusMessage = '';
try {
preview = await previewDelete(collage.directory_id);
selectedMatches = preview.list_file_changes_preview;
} catch (err) {
statusMessage = 'Preview failed.';
} finally {
busy = false;
}
}
async function handleConfirmDelete() {
if (!collage || !preview) return;
busy = true;
statusMessage = '';
try {
await confirmDelete(collage.directory_id, {
preview_id: preview.preview_id,
remove_from_list_file: selectedMatches.length > 0,
selected_matches: selectedMatches,
});
statusMessage = 'Delete staged.';
await loadNext();
} catch (err) {
statusMessage = 'Delete failed.';
} finally {
busy = false;
}
}
onMount(loadNext);
</script>
<section class="page">
<header>
<h1>Untagged Collage Review</h1>
<p>Curate directories quickly, with staged deletes and list-file previews.</p>
</header>
{#if collage}
<div class="summary">
<h2>{collage.directory_name}</h2>
<div class="meta">
<span>{collage.file_count} files</span>
<span>{Math.round(collage.total_size_bytes / (1024 * 1024))} MB</span>
</div>
</div>
<div class="grid">
{#each collage.samples as item}
<div class="tile">
<div class="badge">{item.media_type}</div>
<div class="path">{item.relative_path}</div>
</div>
{/each}
</div>
<UntaggedControls
{busy}
onResample={handleResample}
onKeep={handleKeep}
onPreviewDelete={handlePreviewDelete}
onConfirmDelete={handleConfirmDelete}
confirmEnabled={Boolean(preview)}
/>
{#if preview}
<ListFileMatches
matches={preview.list_file_changes_preview}
bind:selected={selectedMatches}
/>
{/if}
{:else}
<p class="empty">{statusMessage}</p>
{/if}
{#if statusMessage && collage}
<p class="status">{statusMessage}</p>
{/if}
</section>
<style>
:global(body) {
font-family: 'Space Grotesk', 'Fira Sans', sans-serif;
margin: 0;
background: radial-gradient(circle at top left, #f8efe1, #f4dfc8 55%, #e6c39b 100%);
color: #1b130b;
}
.page {
padding: 2rem;
max-width: 1100px;
margin: 0 auto;
}
header h1 {
font-size: 2.4rem;
margin-bottom: 0.2rem;
}
header p {
margin-top: 0;
color: #5b4634;
}
.summary {
margin: 1.5rem 0;
display: flex;
justify-content: space-between;
align-items: baseline;
border-bottom: 1px solid #bba486;
padding-bottom: 0.5rem;
}
.meta {
display: flex;
gap: 1rem;
font-weight: 600;
}
.grid {
display: grid;
grid-template-columns: repeat(auto-fill, minmax(160px, 1fr));
gap: 1rem;
margin-bottom: 1.5rem;
}
.tile {
background: #fff8ee;
border-radius: 16px;
padding: 0.8rem;
min-height: 120px;
box-shadow: 0 8px 24px rgba(73, 45, 22, 0.15);
}
.badge {
font-size: 0.75rem;
text-transform: uppercase;
letter-spacing: 0.08em;
color: #8d5b3c;
}
.path {
font-size: 0.85rem;
margin-top: 0.6rem;
color: #3b2a1d;
word-break: break-word;
}
.status {
margin-top: 1rem;
font-weight: 600;
}
.empty {
font-style: italic;
padding: 2rem 0;
}
</style>

frontend/src/services/untagged_api.ts Normal file

@@ -0,0 +1,70 @@
export type MediaItem = {
id: string;
user_directory_id: string;
relative_path: string;
size_bytes: number;
media_type: string;
};
export type UntaggedCollage = {
directory_id: string;
directory_name: string;
total_size_bytes: number;
file_count: number;
samples: MediaItem[];
};
export type DeletePreview = {
preview_id: string;
target_paths: string[];
list_file_changes_preview: string[];
can_proceed: boolean;
read_only_mode: boolean;
};
export type DeleteConfirm = {
preview_id: string;
remove_from_list_file: boolean;
selected_matches?: string[];
};
export type DecisionResult = {
outcome: string;
audit_entry_id: string;
};
const API_BASE = import.meta.env.VITE_API_BASE ?? '';
async function request<T>(path: string, options?: RequestInit): Promise<T> {
const response = await fetch(`${API_BASE}${path}`, {
headers: { 'Content-Type': 'application/json' },
...options,
});
if (!response.ok) {
throw new Error(`Request failed: ${response.status}`);
}
return (await response.json()) as T;
}
export function fetchNextUntagged(): Promise<UntaggedCollage> {
return request('/directories/untagged/next');
}
export function resampleCollage(directoryId: string): Promise<UntaggedCollage> {
return request(`/directories/untagged/${directoryId}/resample`, { method: 'POST' });
}
export function keepDirectory(directoryId: string): Promise<DecisionResult> {
return request(`/directories/untagged/${directoryId}/keep`, { method: 'POST' });
}
export function previewDelete(directoryId: string): Promise<DeletePreview> {
return request(`/directories/untagged/${directoryId}/preview-delete`, { method: 'POST' });
}
export function confirmDelete(directoryId: string, payload: DeleteConfirm): Promise<DecisionResult> {
return request(`/directories/untagged/${directoryId}/confirm-delete`, {
method: 'POST',
body: JSON.stringify(payload),
});
}


@@ -0,0 +1,34 @@
# Specification Quality Checklist: Archive Curator
**Purpose**: Validate specification completeness and quality before proceeding to planning
**Created**: 2026-02-07
**Feature**: /home/jawz/Development/gallery-organizer-web/specs/001-archive-curator/spec.md
## Content Quality
- [x] No implementation details (languages, frameworks, APIs)
- [x] Focused on user value and business needs
- [x] Written for non-technical stakeholders
- [x] All mandatory sections completed
## Requirement Completeness
- [x] No [NEEDS CLARIFICATION] markers remain
- [x] Requirements are testable and unambiguous
- [x] Success criteria are measurable
- [x] Success criteria are technology-agnostic (no implementation details)
- [x] All acceptance scenarios are defined
- [x] Edge cases are identified
- [x] Scope is clearly bounded
- [x] Dependencies and assumptions identified
## Feature Readiness
- [x] All functional requirements have clear acceptance criteria
- [x] User scenarios cover primary flows
- [x] Feature meets measurable outcomes defined in Success Criteria
- [x] No implementation details leak into specification
## Notes
- Items marked incomplete require spec updates before `/speckit.clarify` or `/speckit.plan`


@@ -0,0 +1,351 @@
openapi: 3.0.3
info:
title: Archive Curator API
version: 0.1.0
servers:
- url: http://localhost:8080
paths:
/health:
get:
summary: Service health check
responses:
'200':
description: OK
/config:
get:
summary: Get current configuration
responses:
'200':
description: Configuration
content:
application/json:
schema:
$ref: '#/components/schemas/Configuration'
put:
summary: Update configuration
requestBody:
required: true
content:
application/json:
schema:
$ref: '#/components/schemas/Configuration'
responses:
'200':
description: Updated configuration
content:
application/json:
schema:
$ref: '#/components/schemas/Configuration'
'400':
description: Invalid or unsafe configuration
/directories/untagged/next:
get:
summary: Get next untagged directory for review
responses:
'200':
description: Untagged directory collage
content:
application/json:
schema:
$ref: '#/components/schemas/UntaggedCollage'
'404':
description: No untagged directories available
/directories/untagged/{directoryId}/resample:
post:
summary: Resample collage for current untagged directory
parameters:
- name: directoryId
in: path
required: true
schema:
type: string
responses:
'200':
description: Updated collage
content:
application/json:
schema:
$ref: '#/components/schemas/UntaggedCollage'
/directories/untagged/{directoryId}/keep:
post:
summary: Keep an untagged directory
parameters:
- name: directoryId
in: path
required: true
schema:
type: string
responses:
'200':
description: Keep decision recorded
content:
application/json:
schema:
$ref: '#/components/schemas/DecisionResult'
'409':
description: Read-only mode enabled
/directories/untagged/{directoryId}/preview-delete:
post:
summary: Preview deletion for an untagged directory
parameters:
- name: directoryId
in: path
required: true
schema:
type: string
responses:
'200':
description: Preview of deletion and list-file changes
content:
application/json:
schema:
$ref: '#/components/schemas/DeletePreview'
'409':
description: Read-only mode enabled
/directories/untagged/{directoryId}/confirm-delete:
post:
summary: Confirm deletion for an untagged directory
parameters:
- name: directoryId
in: path
required: true
schema:
type: string
requestBody:
required: true
content:
application/json:
schema:
$ref: '#/components/schemas/DeleteConfirm'
responses:
'200':
description: Deletion executed and audited
content:
application/json:
schema:
$ref: '#/components/schemas/DecisionResult'
'409':
description: Read-only mode enabled
/triage/whitelist/next:
get:
summary: Get next whitelisted media item for triage
parameters:
- name: scope
in: query
required: false
schema:
type: string
description: all or specific user id
- name: order
in: query
required: false
schema:
type: string
enum: [random, largest]
responses:
'200':
description: Triage item
content:
application/json:
schema:
$ref: '#/components/schemas/TriageItem'
'404':
description: No items available
/triage/whitelist/{itemId}/keep:
post:
summary: Keep current triage item
parameters:
- name: itemId
in: path
required: true
schema:
type: string
responses:
'200':
description: Keep recorded
content:
application/json:
schema:
$ref: '#/components/schemas/DecisionResult'
/triage/whitelist/{itemId}/preview-delete:
post:
summary: Preview deletion for a whitelisted media item
parameters:
- name: itemId
in: path
required: true
schema:
type: string
responses:
'200':
description: Preview of file deletion
content:
application/json:
schema:
$ref: '#/components/schemas/DeletePreview'
'409':
description: Read-only mode enabled
/triage/whitelist/{itemId}/confirm-delete:
post:
summary: Confirm deletion for a whitelisted media item
parameters:
- name: itemId
in: path
required: true
schema:
type: string
requestBody:
required: true
content:
application/json:
schema:
$ref: '#/components/schemas/DeleteConfirm'
responses:
'200':
description: Deletion executed and audited
content:
application/json:
schema:
$ref: '#/components/schemas/DecisionResult'
'409':
description: Read-only mode enabled
/audit/recent:
get:
summary: Get recent audit entries
responses:
'200':
description: Recent audit log
content:
application/json:
schema:
type: array
items:
$ref: '#/components/schemas/AuditEntry'
components:
schemas:
Configuration:
type: object
required:
- untagged_root
- whitelisted_root
- kept_root
- trash_root
- download_list_path
- audit_log_path
- state_db_path
- read_only_mode
- hard_delete_enabled
properties:
untagged_root:
type: string
whitelisted_root:
type: string
kept_root:
type: string
trash_root:
type: string
download_list_path:
type: string
audit_log_path:
type: string
state_db_path:
type: string
read_only_mode:
type: boolean
hard_delete_enabled:
type: boolean
excluded_patterns:
type: array
items:
type: string
UntaggedCollage:
type: object
properties:
directory_id:
type: string
directory_name:
type: string
total_size_bytes:
type: integer
file_count:
type: integer
samples:
type: array
items:
$ref: '#/components/schemas/MediaItem'
MediaItem:
type: object
properties:
id:
type: string
user_directory_id:
type: string
relative_path:
type: string
size_bytes:
type: integer
media_type:
type: string
TriageItem:
type: object
properties:
media_item:
$ref: '#/components/schemas/MediaItem'
user_name:
type: string
DeletePreview:
type: object
properties:
preview_id:
type: string
target_paths:
type: array
items:
type: string
list_file_changes_preview:
type: array
items:
type: string
can_proceed:
type: boolean
read_only_mode:
type: boolean
DeleteConfirm:
type: object
required:
- preview_id
- remove_from_list_file
properties:
preview_id:
type: string
remove_from_list_file:
type: boolean
DecisionResult:
type: object
properties:
outcome:
type: string
audit_entry_id:
type: string
AuditEntry:
type: object
properties:
id:
type: string
timestamp:
type: string
action_type:
type: string
affected_paths:
type: array
items:
type: string
list_file_changes:
type: array
items:
type: string
outcome:
type: string
preview_id:
type: string

View File

@@ -0,0 +1,92 @@
# Data Model: Archive Curator
## Entities
### UserDirectory
- **Fields**: id, name, absolute_path, relative_path, total_size_bytes, file_count,
state, created_at, updated_at
- **Validation**:
- state MUST be one of Untagged, Whitelisted, Blacklisted, Kept
- absolute_path MUST be within configured roots
- relative_path MUST be stable and derived from root + name
- **Relationships**:
- has many MediaItem
- has many DecisionRecord
### MediaItem
- **Fields**: id, user_directory_id, absolute_path, relative_path, size_bytes,
media_type, created_at
- **Validation**:
- absolute_path MUST be within parent directory
- media_type MUST be one of image/video/other
- **Relationships**:
- belongs to UserDirectory
- may appear in TriageQueue
### DirectoryState
- **Fields**: user_directory_id, state, updated_at
- **Validation**:
- state MUST be one of Untagged, Whitelisted, Blacklisted, Kept
- **Relationships**:
- one-to-one with UserDirectory
### DecisionRecord
- **Fields**: id, user_directory_id, decision_type, decision_scope, operator,
preview_id, confirmed_at, outcome
- **Validation**:
- decision_type MUST be Keep or Delete
- decision_scope MUST be Directory or File
- **Relationships**:
- belongs to UserDirectory
- references PreviewAction
### PreviewAction
- **Fields**: id, action_type, target_paths, list_file_changes_preview, created_at,
expires_at
- **Validation**:
- action_type MUST be DirectoryDelete or FileDelete
- target_paths MUST be within configured roots
- **Relationships**:
- linked to DecisionRecord on confirmation
### AuditEntry
- **Fields**: id, timestamp, action_type, affected_paths, list_file_changes,
outcome, preview_id
- **Validation**:
- action_type MUST match destructive or state-change actions
- affected_paths MUST be within configured roots
### DownloadListEntry
- **Fields**: raw_line, normalized_value, matched
- **Validation**:
- default matching is case-insensitive after trimming surrounding whitespace
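The matching rule above is small enough to pin down in code. A minimal sketch, assuming the
list file is read as plain lines; function names here are illustrative, not the actual
`list_file.rs` API:

```rust
/// Normalized form used for matching: surrounding whitespace trimmed, then lowercased.
fn normalize(entry: &str) -> String {
    entry.trim().to_lowercase()
}

/// Return the raw lines whose normalized form exactly matches the normalized directory
/// name. Raw lines are kept untouched so a preview can show exactly what would be removed.
fn matching_lines<'a>(lines: &'a [String], directory_name: &str) -> Vec<&'a str> {
    let needle = normalize(directory_name);
    lines
        .iter()
        .map(|l| l.as_str())
        .filter(|l| normalize(l) == needle)
        .collect()
}

fn main() {
    let lines = vec![
        "  SomeUser ".to_string(),
        "other_user".to_string(),
        "someuser".to_string(),
    ];
    // Both "  SomeUser " and "someuser" match "SomeUser"; "other_user" does not.
    assert_eq!(matching_lines(&lines, "SomeUser").len(), 2);
}
```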
### Configuration
- **Fields**: untagged_root, whitelisted_root, kept_root, trash_root,
download_list_path, audit_log_path, state_db_path, read_only_mode,
hard_delete_enabled, excluded_patterns
- **Validation**:
- roots MUST be explicit and non-overlapping
- read_only_mode MUST block all mutations
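A minimal sketch of the non-overlap rule above, assuming a lexical comparison of
already-configured paths (a real validation would canonicalize first; names and example
paths are illustrative):

```rust
use std::path::Path;

/// Reject configurations where any root is nested inside (or equal to) another root.
fn validate_roots_non_overlapping(roots: &[&Path]) -> Result<(), String> {
    for (i, a) in roots.iter().enumerate() {
        for (j, b) in roots.iter().enumerate() {
            if i != j && a.starts_with(b) {
                return Err(format!("root {} overlaps with root {}", a.display(), b.display()));
            }
        }
    }
    Ok(())
}

fn main() {
    let ok = [
        Path::new("/srv/archive/untagged"),
        Path::new("/srv/archive/whitelisted"),
        Path::new("/srv/archive/kept"),
        Path::new("/srv/archive/trash"),
    ];
    assert!(validate_roots_non_overlapping(&ok).is_ok());

    // The trash root nested inside another root must be rejected fail-fast.
    let bad = [Path::new("/srv/archive"), Path::new("/srv/archive/trash")];
    assert!(validate_roots_non_overlapping(&bad).is_err());
}
```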
## State Transitions
### UserDirectory State
- Untagged → Kept (keep decision; move to kept root)
- Untagged → Blacklisted (delete decision, prior to deletion)
- Untagged → Whitelisted (manual curation)
- Whitelisted → Kept (explicit preserve; move to kept root)
- Blacklisted → Deleted (after confirmation and action)
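The allowed transitions above can be encoded as a small table; a minimal sketch with
illustrative names (confirmed removal of a Blacklisted directory is handled by the
deletion workflow below rather than as another state):

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum DirectoryState {
    Untagged,
    Whitelisted,
    Blacklisted,
    Kept,
}

/// Directory-level transitions allowed by the state model; anything else is rejected.
fn transition_allowed(from: DirectoryState, to: DirectoryState) -> bool {
    use DirectoryState::*;
    matches!(
        (from, to),
        (Untagged, Kept)
            | (Untagged, Blacklisted)
            | (Untagged, Whitelisted)
            | (Whitelisted, Kept)
    )
}

fn main() {
    use DirectoryState::*;
    assert!(transition_allowed(Untagged, Kept));
    // Whitelisted directories are never blacklisted by directory-level actions.
    assert!(!transition_allowed(Whitelisted, Blacklisted));
}
```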
### Deletion Workflow
- PreviewAction created (dry-run)
- User confirms → DecisionRecord created
- Action executed → AuditEntry appended
## Derived Views
### UntaggedCollage
- **Fields**: user_directory_id, sample_media_items[], total_size_bytes, file_count
### WhitelistTriageItem
- **Fields**: media_item_id, user_directory_id, size_bytes, media_type, relative_path

View File

@@ -0,0 +1,90 @@
# Implementation Plan: Archive Curator
**Branch**: `001-archive-curator` | **Date**: 2026-02-07 | **Spec**: /home/jawz/Development/gallery-organizer-web/specs/001-archive-curator/spec.md
**Input**: Feature specification from `/specs/001-archive-curator/spec.md`
**Note**: This template is filled in by the `/speckit.plan` command. See `.specify/templates/commands/plan.md` for the execution workflow.
## Summary
Build a safe, web-based curator for a local media archive where each subdirectory
represents a scraped user. The system provides two core workflows: untagged directory
collage review for keep/delete decisions (with list-file preview/removal) and
whitelisted media triage for single-file deletion with strict directory protection.
All destructive actions are previewed, confirmed, serialized, and audited. Delivery is
phased from a read-only viewer through deletion workflows, hardening, and NixOS module
maturity.
## Technical Context
**Language/Version**: Rust (stable toolchain)
**Primary Dependencies**: Web API framework (Axum), UI framework (SvelteKit), OpenAPI tooling
**Storage**: Local durable state store (SQLite) + append-only audit log file
**Testing**: cargo test (unit/integration), API tests (HTTP), NixOS VM tests
**Target Platform**: NixOS/Linux (local network, single-operator)
**Project Type**: Web application (backend + frontend)
**Performance Goals**: Visual review stays responsive; collage load and next-item
advance feel immediate for local storage
**Constraints**: Safety-first, offline/local-network only, strict root boundaries,
read-only mode support, serialized destructive ops
**Scale/Scope**: Large local archives with many user directories and large media files
## Constitution Check
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
- Whitelisted directories remain protected from any directory-level delete
- All destructive actions include dry-run preview, explicit confirmation, and audit log
- Read-only mode disables all mutations while allowing browsing
- Destructive operations are single-writer and never concurrent
- Operations are bounded to configured roots; destructive ops never follow symlinks
- Default deletion is two-stage (trash/staging); hard delete is explicit + confirmed
- List-file edits are previewed, optional, atomic; exact-match removal by default
- UI surfaces directory state and recent audit activity for verification
**Gate Status**: PASS (requirements and plan explicitly enforce all constraints)
**Post-Design Re-check**: PASS (data model and contracts preserve all safety gates)
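Two of these gates are easy to get subtly wrong in code, the root boundary and the symlink
rule, so here is a minimal sketch of the intended checks, assuming a guard similar to the
planned path_guard service (names, error handling, and example paths are illustrative; the
containment check here is lexical, and a real guard would also canonicalize parent
components without following the final link):

```rust
use std::io;
use std::path::Path;

/// Gate for destructive actions: the target must sit inside one of the configured roots,
/// and symlinks are never followed. The returned flag tells the caller whether the target
/// is itself a symlink, in which case only the link may be removed, never what it points to.
fn check_destructive_target(target: &Path, roots: &[&Path]) -> io::Result<bool> {
    if !roots.iter().any(|root| target.starts_with(root)) {
        return Err(io::Error::new(
            io::ErrorKind::PermissionDenied,
            "target is outside the configured roots",
        ));
    }
    // symlink_metadata inspects the entry itself and does not follow a final symlink.
    let meta = std::fs::symlink_metadata(target)?;
    Ok(meta.file_type().is_symlink())
}

fn main() {
    let roots = [Path::new("/srv/archive/untagged"), Path::new("/srv/archive/trash")];
    // Refused before touching the filesystem: /etc is outside every configured root.
    assert!(check_destructive_target(Path::new("/etc/passwd"), &roots).is_err());
}
```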
## Project Structure
### Documentation (this feature)
```text
specs/001-archive-curator/
├── plan.md # This file (/speckit.plan command output)
├── research.md # Phase 0 output (/speckit.plan command)
├── data-model.md # Phase 1 output (/speckit.plan command)
├── quickstart.md # Phase 1 output (/speckit.plan command)
├── contracts/ # Phase 1 output (/speckit.plan command)
└── tasks.md # Phase 2 output (/speckit.tasks command - NOT created by /speckit.plan)
```
### Source Code (repository root)
```text
backend/
├── src/
│ ├── models/
│ ├── services/
│ └── api/
└── tests/
frontend/
├── src/
│ ├── components/
│ ├── pages/
│ └── services/
└── tests/
```
**Structure Decision**: Web application with separate backend and frontend to enable a
Rust API service and a touch-focused web UI while keeping filesystem mutations confined
to the operations layer.
## Complexity Tracking
> **Fill ONLY if Constitution Check has violations that must be justified**
| Violation | Why Needed | Simpler Alternative Rejected Because |
|-----------|------------|-------------------------------------|
| N/A | N/A | N/A |

View File

@@ -0,0 +1,42 @@
# Quickstart: Archive Curator (Phase 1)
## Purpose
Run the service in a safe, local-only configuration and confirm read-only browsing,
untagged review, and audit visibility behaviors.
## Prerequisites
- Local archive directories available (untagged, whitelisted, kept, trash)
- Download list file path available
- Nix installed and flake support enabled
## Setup Steps
1. Configure required paths and safety flags:
- untagged root
- whitelisted root
- kept root
- trash/staging root
- download list file path
- audit log path
- state database path
- read-only mode
- hard delete enabled (default off)
2. Start the service via NixOS module or local run target.
3. Open the web UI on the configured bind address and port.
## Smoke Checks
- Read-only mode blocks all mutations while allowing browsing.
- Untagged directory collage loads with directory name, size, file count, and samples.
- Resample changes the collage without leaving the directory.
- Preview/confirm flow is required before any destructive action.
- Audit view shows recent actions with paths and timestamps.
## Safety Verification
- Whitelisted directories cannot be deleted by directory-level actions.
- Deletions are staged to trash by default.
- List-file edits are previewed and optional, and matching is case-insensitive after
  trimming surrounding whitespace.
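A minimal sketch of what staging a directory to trash might look like, assuming the trash
root sits on the same filesystem so a rename suffices (names, the timestamp scheme, and
example paths are illustrative; cross-device moves would need a copy-then-delete fallback):

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};

/// Stage a directory for deletion by moving it into the trash root under a timestamped
/// name instead of removing it permanently.
fn stage_to_trash(target: &Path, trash_root: &Path) -> io::Result<PathBuf> {
    let name = target
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "target has no name"))?;
    let stamp = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map_err(|e| io::Error::new(io::ErrorKind::Other, e))?
        .as_secs();
    let staged = trash_root.join(format!("{}-{}", stamp, name.to_string_lossy()));
    // rename is atomic on the same filesystem and moves the entry itself,
    // so a symlinked target is never followed.
    fs::rename(target, &staged)?;
    Ok(staged)
}

fn main() -> io::Result<()> {
    // Example only; adjust paths before running against a real archive.
    let staged = stage_to_trash(
        Path::new("/srv/archive/untagged/example_user"),
        Path::new("/srv/archive/trash"),
    )?;
    println!("staged at {}", staged.display());
    Ok(())
}
```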

View File

@@ -0,0 +1,35 @@
# Phase 0 Research: Archive Curator
## Decision: Backend Web API Framework
- **Decision**: Axum
- **Rationale**: Rust-native, async-first, strong ecosystem support, and clean layering
between validation and business logic.
- **Alternatives considered**: Actix Web (fast, but with heavier abstractions), Warp
  (composable, but less actively maintained).
## Decision: Frontend UI Framework
- **Decision**: SvelteKit
- **Rationale**: Small bundle size, fast UI updates, and well-suited for touch-first,
swipe-capable interfaces.
- **Alternatives considered**: React (larger bundle, more boilerplate), Vue (good fit
but less direct for swipe-first interactions without extra libraries).
## Decision: State Storage
- **Decision**: SQLite for directory state + append-only audit log file
- **Rationale**: Durable local storage with simple deployment, supports atomic updates
and queryable state history; audit log remains append-only and retained indefinitely.
- **Alternatives considered**: Flat JSON state (simpler but weaker concurrency and
integrity guarantees), embedded key-value store (less standard tooling).
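A minimal sketch of the append-only write, assuming one serialized entry per line (the line
format and function name are illustrative, not the final audit schema):

```rust
use std::fs::OpenOptions;
use std::io::{self, Write};
use std::path::Path;

/// Append a single audit line to the log file. The file is only ever opened in append
/// mode, so existing entries are never rewritten or truncated.
fn append_audit_line(audit_log_path: &Path, line: &str) -> io::Result<()> {
    let mut file = OpenOptions::new()
        .create(true)
        .append(true)
        .open(audit_log_path)?;
    file.write_all(line.as_bytes())?;
    file.write_all(b"\n")?;
    // Sync so the entry is durable before the action is reported as done.
    file.sync_all()
}

fn main() -> io::Result<()> {
    append_audit_line(
        Path::new("/tmp/archive-curator-audit.log"),
        r#"{"action":"keep","path":"/srv/archive/untagged/example_user"}"#,
    )
}
```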
## Decision: API Contract Style
- **Decision**: REST with explicit preview/confirm endpoints
- **Rationale**: Clear separation between preview and confirmation phases enforces
safety requirements; easy to document and test.
- **Alternatives considered**: GraphQL (flexible but less explicit for multi-step
destructive actions).
## Decision: Testing Strategy
- **Decision**: Rust unit/integration tests + HTTP API tests + NixOS VM tests
- **Rationale**: Aligns with the binding testing plan and ensures safety rules are enforced
at multiple layers, including system service behavior.
- **Alternatives considered**: Unit tests only (insufficient for safety guarantees).

View File

@@ -0,0 +1,204 @@
# Feature Specification: Archive Curator
**Feature Branch**: `001-archive-curator`
**Created**: 2026-02-07
**Status**: Draft
**Input**: User description: "Build a web-based curator for a large local media archive generated by automated scraping, where each subdirectory represents one scraped user. The goal is to give a human fast, visual, and safe control over what stays, what goes, and what should never be downloaded again. WHY THIS EXISTS The archive has grown beyond what can be managed with a file manager. Many users are junk, duplicates, or low-value, while others must be preserved permanently. Deleting blindly is dangerous, and deleting a user without stopping future downloads causes the archive to refill itself. This system exists to make irreversible decisions deliberate, visible, and traceable. CORE IDEAS • A “user directory” is the atomic unit of decision. • Each user directory must have an explicit state: Untagged: not yet reviewed. Whitelisted: the user is valuable and protected. Blacklisted: the user is safe to delete. Kept: explicitly preserved and removed from the decision pool. • The current state must always be visible and must be the single source of truth. NON-NEGOTIABLE SAFETY • Whitelisted user directories must never be deletable by directory-level actions. • All destructive actions must: 1) show a preview of what will change, 2) require explicit human confirmation, 3) leave a permanent audit record. • The system must refuse to act outside explicitly configured root paths. • A global read-only mode must exist where nothing can be modified. • Destructive operations must be serialized and never run concurrently. • Symlinks must never be followed during deletion. REQUIRED WORKFLOWS MODE 1 — WHITELIST MEDIA TRIAGE Purpose: reclaim disk space without risking loss of important users. Behavior: • Show one image or video at a time belonging only to whitelisted users. • Allow selection scope: all whitelisted users (random), a specific whitelisted user. • Provide two viewing orders: random, largest files first. • Actions per item: Keep (no change), Delete this file (confirmation required). • Always display: owning user, file size, media type, relative path. • Automatically advance after each decision. Success condition: I can rapidly delete low-value files while being confident I cannot delete an entire whitelisted user. MODE 2 — UNTAGGED DIRECTORY COLLAGE REVIEW Purpose: decide whether an entire user is worth keeping. Behavior: • Select one untagged user directory at a time. • Display: directory name, total size, file count, a collage of randomly sampled files from that directory. • Provide a “resample” action to view a different random subset. • Decisions: Keep: * mark or move the directory as preserved, * remove it from the untagged pool, * record the decision. Delete: * before deletion, attempt to find the user in a plain-text download list file, * preview whether the user exists and what lines would be removed, * allow choosing whether to remove the user from the list file, * require a high-friction confirmation before deleting the directory, * perform selected actions and record them. DOWNLOAD LIST FILE HANDLING • The download list file controls future scraping and must be treated as critical data. • User removal must be conservative and explicit. • Default behavior is exact-match removal only. • All edits must be previewed and performed atomically. • If no entry is found, this must be clearly shown and handled safely. AUDITING AND TRACEABILITY • Every mutation must create an append-only audit entry. 
• Audit entries must capture: what action occurred, what paths were affected, whether the download list file was edited, when it happened. • Audit history must be viewable for verification. CONFIGURATION BEHAVIOR The system must allow configuration of: • untagged pool root, • whitelisted root, • kept root, • trash or staging area for deletions, • download list file path, • read-only mode, • whether deletions are staged or permanent by default. OUT OF SCOPE • Automatic or unattended deletion. • Machine-generated keep/delete decisions. • Silent bulk operations. • Integration with the scraper beyond editing the download list file. DONE MEANS • I can safely delete individual files from whitelisted users. • I can quickly decide keep or delete for untagged users using visual samples. • Deleting a user can also stop future downloads in a controlled, previewed way. • Whitelist protection is enforced and cannot be bypassed. • Every destructive action is previewed, confirmed, and auditable."
## Clarifications
### Session 2026-02-07
- Q: What is the default list-file matching rule? → A: Case-insensitive match after trimming surrounding whitespace.
- Q: What is the access control model? → A: No authentication; access limited to trusted local network.
- Q: What is the audit retention policy? → A: Retain audit history indefinitely.
- Q: What does “Kept” mean operationally? → A: Move to the kept root.
- Q: How should multiple list-file matches be handled? → A: List all matches and allow selective removal.
## User Scenarios & Testing *(mandatory)*
### User Story 1 - Untagged Directory Decisions (Priority: P1)
As a curator, I review one untagged user at a time using a visual collage so I can
quickly decide to keep or delete the entire user directory.
**Why this priority**: This is the primary workflow to shrink the archive and make
high-impact decisions safely.
**Independent Test**: Can be fully tested by selecting an untagged directory, making
keep and delete decisions, and verifying that the directory leaves the untagged pool
with an audit entry and correct list-file handling.
**Acceptance Scenarios**:
1. **Given** an untagged directory with media, **When** I open it, **Then** I see the
directory name, total size, file count, and a collage of random samples.
2. **Given** an untagged directory, **When** I resample, **Then** the collage changes
while staying within the same directory.
3. **Given** an untagged directory, **When** I choose Keep and confirm, **Then** the
directory is marked or moved as preserved, removed from the untagged pool, and the
decision is recorded.
4. **Given** an untagged directory, **When** I choose Delete, **Then** I see a preview
of the directory deletion and any matching download list entries before I can
confirm.
5. **Given** a delete decision, **When** I decline list-file removal, **Then** the
directory deletion proceeds without modifying the list file and the outcome is
audited.
---
### User Story 2 - Whitelisted Media Triage (Priority: P2)
As a curator, I triage individual files inside whitelisted user directories so I can
reclaim disk space without risking loss of the entire user.
**Why this priority**: It enables safe space recovery while preserving valuable users.
**Independent Test**: Can be tested by selecting whitelisted users, viewing items in
random and size-prioritized order, deleting individual files with confirmation, and
verifying the parent directory remains intact.
**Acceptance Scenarios**:
1. **Given** whitelisted users exist, **When** I start triage, **Then** I see one item
at a time with owning user, file size, media type, and relative path.
2. **Given** triage mode, **When** I switch ordering, **Then** items are presented in
random order or largest-first order as chosen.
3. **Given** a whitelisted file, **When** I choose Delete and confirm, **Then** only
the file is removed and the parent directory remains protected.
4. **Given** triage mode, **When** I choose Keep, **Then** no mutation occurs and the
next item is shown automatically.
---
### User Story 3 - Audit Visibility and Safe Configuration (Priority: P3)
As a curator, I can view recent audit history and configure required paths and safety
modes so I can verify past actions and keep operations within safe boundaries.
**Why this priority**: It provides traceability and enforces safety constraints for all
other workflows.
**Independent Test**: Can be tested by configuring roots and safety modes, attempting
out-of-bounds actions, and verifying audit history visibility.
**Acceptance Scenarios**:
1. **Given** a configured system, **When** I open the audit view, **Then** I can see
recent mutations with action, paths, list-file edits, and timestamps.
2. **Given** I enable read-only mode, **When** I attempt a destructive action, **Then**
the action is blocked and I am informed no changes were made.
3. **Given** configured root paths, **When** I attempt a destructive action outside
those roots, **Then** the system refuses the action and records the refusal.
---
### Edge Cases
- What happens when an untagged directory is empty or contains only unsupported media?
- How does the system handle missing or unreadable download list files?
- What happens if a directory or file is a symlink during a delete action?
- How does the system behave if read-only mode is enabled mid-session?
- What happens when a directory is moved or deleted outside the tool while being viewed?
### Scope & Non-Goals
**In scope**: human-driven review, safe deletion workflows, list-file preview/removal,
and audit visibility for verification.
**Out of scope**: automatic or unattended deletion, machine-generated decisions,
silent bulk operations, and integration with the scraper beyond list-file edits.
## Requirements *(mandatory)*
### Functional Requirements
- **FR-001**: System MUST treat each user directory as the atomic unit of decision.
- **FR-002**: System MUST maintain a single visible state per user directory: Untagged,
Whitelisted, Blacklisted, or Kept.
- **FR-003**: System MUST provide an untagged directory review view with a random-sample
collage, directory name, total size, and file count.
- **FR-004**: System MUST allow resampling the collage without changing the directory.
- **FR-005**: System MUST allow Keep and Delete decisions for untagged directories and
record the decision outcome.
- **FR-005a**: Keep decisions MUST move the directory to the kept root and remove it
from the untagged pool.
- **FR-006**: System MUST attempt to locate matching entries in the download list file
before deleting a directory and present a preview of potential removals.
- **FR-007**: System MUST allow the user to choose whether to remove matching list-file
entries, independently of directory deletion.
- **FR-007a**: If multiple list-file matches are found, the system MUST list all
matches and allow selective removal.
- **FR-008**: System MUST provide whitelisted media triage that shows one item at a time
with owning user, file size, media type, and relative path.
- **FR-009**: System MUST allow triage scope to be all whitelisted users or a specific
whitelisted user.
- **FR-010**: System MUST support random and largest-first ordering in triage mode.
- **FR-011**: System MUST auto-advance to the next item after Keep or Delete actions.
- **FR-012**: System MUST provide a view of recent audit history.
- **FR-013**: System MUST allow configuration of untagged root, whitelisted root, kept
root, trash/staging area, download list file path, read-only mode, and deletion
staging behavior.
- **FR-014**: System MUST clearly state when no matching download list entry is found
and proceed safely without list-file edits.
- **FR-015**: System MUST match download list entries using case-insensitive comparison
after trimming surrounding whitespace.
- **FR-016**: System MUST operate without authentication and assume access is limited
to a trusted local network.
- **FR-017**: System MUST retain audit history indefinitely.
### Safety & Data Preservation Requirements *(mandatory for destructive actions)*
- **SR-001**: System MUST provide a dry-run preview for destructive actions.
- **SR-002**: System MUST require explicit confirmation before destructive actions.
- **SR-003**: System MUST append an audit record for every mutation.
- **SR-004**: System MUST refuse to act outside configured root paths.
- **SR-005**: System MUST NOT follow symlinks for destructive actions.
- **SR-006**: System MUST provide a global read-only mode that disables mutations.
- **SR-007**: System MUST default to two-stage deletion (trash/staging) unless
explicitly configured.
- **SR-008**: System MUST serialize destructive operations and disallow concurrent
deletes.
- **SR-009**: Whitelisted directories MUST never be deletable by directory-level
actions.
- **SR-010**: List-file edits MUST be previewed and performed atomically, with
exact-match removal by default using case-insensitive match after trimming
surrounding whitespace.
### Key Entities *(include if feature involves data)*
- **User Directory**: Folder containing media for one scraped user, with a single
explicit state.
- **Directory State**: One of Untagged, Whitelisted, Blacklisted, Kept, stored as the
source of truth for decisioning.
- **Media Item**: An image or video file within a user directory.
- **Download List Entry**: A line in the download list file representing a user to be
scraped.
- **Audit Entry**: Append-only record of a mutation with action, paths, list-file
changes, and timestamp.
- **Configuration**: The set of roots, list-file path, read-only mode, and deletion
staging preference that bounds operations.
### Assumptions
- The curator is a single human operator at a time.
- The download list file is plain text with one user entry per line.
- The curated archive resides on local storage accessible to the tool.
### Dependencies
- Access to the configured root paths and download list file on local storage.
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-001**: 90% of untagged directory decisions (keep/delete) complete in under
60 seconds after opening the directory.
- **SC-002**: Users can complete at least 50 whitelisted file triage decisions in
10 minutes without directory-level deletion risk.
- **SC-003**: 100% of destructive actions show a preview, require confirmation, and
create a visible audit entry.
- **SC-004**: 95% of attempted list-file removals report a clear preview of the exact
lines to be removed before confirmation.
- **SC-005**: Read-only mode prevents 100% of mutation attempts while still allowing
browsing and review.

View File

@@ -0,0 +1,234 @@
---
description: "Task list template for feature implementation"
---
# Tasks: Archive Curator
**Input**: Design documents from `/specs/001-archive-curator/`
**Prerequisites**: plan.md (required), spec.md (required for user stories), research.md, data-model.md, contracts/
**Tests**: Test tasks are OPTIONAL and are only included when explicitly requested in the feature specification; no dedicated test tasks are listed below.
**Organization**: Tasks are grouped by user story to enable independent implementation and testing of each story.
## Format: `[ID] [P?] [Story] Description`
- **[P]**: Can run in parallel (different files, no dependencies)
- **[Story]**: Which user story this task belongs to (e.g., US1, US2, US3)
- Include exact file paths in descriptions
## Path Conventions
- **Single project**: `src/`, `tests/` at repository root
- **Web app**: `backend/src/`, `frontend/src/`
- **Mobile**: `api/src/`, `ios/src/` or `android/src/`
- Paths shown below follow the web application structure from plan.md (`backend/src/`, `frontend/src/`)
## Phase 1: Setup (Shared Infrastructure)
**Purpose**: Project initialization and basic structure
- [X] T001 Create backend and frontend directory structure in `backend/src/` and `frontend/src/`
- [X] T002 Initialize Rust backend crate in `backend/Cargo.toml`
- [X] T003 Initialize SvelteKit frontend in `frontend/package.json`
- [X] T004 [P] Add repository-wide formatting and lint configs in `backend/rustfmt.toml` and `frontend/.prettierrc`
- [X] T005 Add base README for local run notes in `README.md`
---
## Phase 2: Foundational (Blocking Prerequisites)
**Purpose**: Core infrastructure that MUST be complete before ANY user story can be implemented
**⚠️ CRITICAL**: No user story work can begin until this phase is complete
- [X] T006 Implement configuration model and validation in `backend/src/config.rs`
- [X] T006a Implement root non-overlap validation (fail-fast) in `backend/src/config.rs`
- [X] T007 Implement root boundary validation helpers in `backend/src/services/path_guard.rs`
- [X] T008 Implement read-only mode enforcement guard in `backend/src/services/read_only.rs`
- [X] T009 Implement state storage access layer in `backend/src/services/state_store.rs`
- [X] T010 Implement audit log append-only writer in `backend/src/services/audit_log.rs`
- [X] T011 Implement list-file parser and matcher in `backend/src/services/list_file.rs`
- [X] T012 Implement preview/confirm action model in `backend/src/services/preview_action.rs`
- [X] T013 Implement filesystem operations facade in `backend/src/services/ops.rs`
- [X] T014 Add HTTP server bootstrap and routing in `backend/src/main.rs`
- [X] T014a Enforce bind address defaults/local-network restriction in `backend/src/main.rs`
**Checkpoint**: Foundation ready - user story implementation can now begin in parallel
---
## Phase 2.5: Safety & Compliance (Mandatory for destructive operations)
**Purpose**: Enforce constitution safety guarantees before any deletion work
- [X] T015 Implement global read-only mode block in `backend/src/services/read_only.rs`
- [X] T016 Enforce root-path boundaries for all filesystem operations in `backend/src/services/path_guard.rs`
- [X] T017 Implement single-writer guard for destructive operations in `backend/src/services/ops_lock.rs`
- [X] T018 Implement dry-run preview + explicit confirmation flow in `backend/src/services/preview_action.rs`
- [X] T019 Implement two-stage deletion (trash/staging) in `backend/src/services/ops.rs`
- [X] T019a Enforce hard-delete disabled by default and require explicit config + confirmation in `backend/src/services/ops.rs`
- [X] T020 Enforce symlink-safe deletion in `backend/src/services/ops.rs`
- [X] T021 Append-only audit log for every mutation in `backend/src/services/audit_log.rs`
- [X] T022 Enforce whitelist protection for directory-level actions in `backend/src/services/ops.rs`
- [X] T023 Implement list-file edit preview + atomic write in `backend/src/services/list_file.rs`
**Checkpoint**: Safety guarantees verified - destructive workflows can now begin
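As one possible shape for T017, a process-wide lock that every destructive operation must
acquire keeps deletes serialized; a minimal sketch with illustrative names (an async
backend would likely use an async-aware lock instead):

```rust
use std::sync::Mutex;

/// Process-wide lock: at most one destructive operation at a time.
static DESTRUCTIVE_OP: Mutex<()> = Mutex::new(());

/// Run a destructive operation only if no other one is in flight; otherwise report
/// "busy" instead of queueing, so the operator stays in control.
fn run_destructive<T>(op: impl FnOnce() -> T) -> Result<T, &'static str> {
    match DESTRUCTIVE_OP.try_lock() {
        Ok(_guard) => Ok(op()),
        Err(_) => Err("another destructive operation is already running"),
    }
}

fn main() {
    let outcome = run_destructive(|| "deleted (staged to trash)");
    assert_eq!(outcome, Ok("deleted (staged to trash)"));
}
```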
---
## Phase 3: User Story 1 - Untagged Directory Decisions (Priority: P1) 🎯 MVP
**Goal**: Review untagged directories with a collage, keep or delete safely, and preview list-file changes
**Independent Test**: Can review an untagged directory, resample the collage, keep it (moving it
to the kept root), preview a delete with list-file matches, and confirm the delete with an audit entry
### Implementation for User Story 1
- [X] T024 [P] [US1] Implement untagged directory selection service in `backend/src/services/untagged_queue.rs`
- [X] T025 [P] [US1] Implement collage sampler in `backend/src/services/collage_sampler.rs`
- [X] T026 [US1] Implement keep decision (move to kept root) in `backend/src/services/ops.rs`
- [X] T027 [US1] Implement delete preview for untagged directories in `backend/src/services/preview_action.rs`
- [X] T028 [US1] Implement delete confirmation for untagged directories in `backend/src/services/ops.rs`
- [X] T029 [P] [US1] Add API endpoints for untagged review in `backend/src/api/untagged.rs`
- [X] T030 [P] [US1] Add API endpoints for untagged delete preview/confirm in `backend/src/api/untagged_delete.rs`
- [X] T030a [P] [US1] Add list-file match selection payload handling in `backend/src/api/untagged_delete.rs`
- [X] T031 [P] [US1] Create collage UI page in `frontend/src/pages/untagged-collage.svelte`
- [X] T032 [P] [US1] Create resample and decision controls in `frontend/src/components/untagged-controls.svelte`
- [X] T032a [P] [US1] Add list-file match selection UI in `frontend/src/components/list-file-matches.svelte`
- [X] T033 [US1] Wire untagged review API client in `frontend/src/services/untagged_api.ts`
**Checkpoint**: User Story 1 fully functional and independently testable
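As one possible shape for the collage sampler (T025) and the resample action, a minimal
sketch that picks a handful of files at random; the `rand` crate, sample size, and function
name are assumptions:

```rust
// Assumes `rand = "0.8"` in backend/Cargo.toml.
use rand::seq::SliceRandom;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Pick up to `count` files from a user directory, uniformly at random.
/// Calling it again naturally implements the "resample" action.
fn sample_files(dir: &Path, count: usize) -> io::Result<Vec<PathBuf>> {
    let mut files: Vec<PathBuf> = fs::read_dir(dir)?
        .filter_map(|entry| entry.ok())
        .map(|entry| entry.path())
        .filter(|p| p.is_file())
        .collect();
    files.shuffle(&mut rand::thread_rng());
    files.truncate(count);
    Ok(files)
}

fn main() -> io::Result<()> {
    for path in sample_files(Path::new("."), 9)? {
        println!("{}", path.display());
    }
    Ok(())
}
```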
---
## Phase 4: User Story 2 - Whitelisted Media Triage (Priority: P2)
**Goal**: Review whitelisted media items one at a time with safe per-file deletion
**Independent Test**: Can scope triage to all whitelisted users or a single user, order items
randomly or largest-first, keep or delete items with confirmation, and auto-advance
### Implementation for User Story 2
- [ ] T034 [P] [US2] Implement whitelist triage queue in `backend/src/services/triage_queue.rs`
- [ ] T035 [P] [US2] Implement ordering strategies in `backend/src/services/triage_order.rs`
- [ ] T036 [US2] Implement per-file delete preview in `backend/src/services/preview_action.rs`
- [ ] T037 [US2] Implement per-file delete confirmation in `backend/src/services/ops.rs`
- [ ] T038 [P] [US2] Add API endpoints for triage in `backend/src/api/triage.rs`
- [ ] T039 [P] [US2] Create triage UI page in `frontend/src/pages/whitelist-triage.svelte`
- [ ] T040 [P] [US2] Create triage item viewer component in `frontend/src/components/triage-item.svelte`
- [ ] T041 [US2] Wire triage API client in `frontend/src/services/triage_api.ts`
**Checkpoint**: User Story 2 fully functional and independently testable
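As one possible shape for the ordering strategies (T035), a minimal sketch that reuses the
same `rand` assumption as the sampler sketch above; the types are illustrative:

```rust
use rand::seq::SliceRandom;

struct TriageCandidate {
    relative_path: String,
    size_bytes: u64,
}

enum TriageOrder {
    Random,
    LargestFirst,
}

/// Order triage candidates either randomly or largest-first.
fn order_candidates(mut items: Vec<TriageCandidate>, order: TriageOrder) -> Vec<TriageCandidate> {
    match order {
        TriageOrder::Random => items.shuffle(&mut rand::thread_rng()),
        TriageOrder::LargestFirst => items.sort_by(|a, b| b.size_bytes.cmp(&a.size_bytes)),
    }
    items
}

fn main() {
    let items = vec![
        TriageCandidate { relative_path: "a.mp4".into(), size_bytes: 900 },
        TriageCandidate { relative_path: "b.jpg".into(), size_bytes: 50 },
    ];
    let ordered = order_candidates(items, TriageOrder::LargestFirst);
    assert_eq!(ordered[0].relative_path, "a.mp4");
}
```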
---
## Phase 5: User Story 3 - Audit Visibility and Safe Configuration (Priority: P3)
**Goal**: View audit history and manage required configuration safely
**Independent Test**: Can view recent audit entries, update configuration with validation,
and verify that read-only mode blocks mutations
### Implementation for User Story 3
- [ ] T042 [P] [US3] Implement audit history query in `backend/src/services/audit_log.rs`
- [ ] T043 [P] [US3] Implement configuration CRUD in `backend/src/api/config.rs`
- [ ] T044 [P] [US3] Add audit history endpoint in `backend/src/api/audit.rs`
- [ ] T045 [P] [US3] Create audit history UI page in `frontend/src/pages/audit-history.svelte`
- [ ] T046 [P] [US3] Create configuration UI page in `frontend/src/pages/configuration.svelte`
- [ ] T047 [US3] Wire audit/config API clients in `frontend/src/services/admin_api.ts`
**Checkpoint**: User Story 3 fully functional and independently testable
---
## Phase 6: Polish & Cross-Cutting Concerns
**Purpose**: Improvements that affect multiple user stories
- [ ] T048 [P] Add touch-friendly styling and swipe cues in `frontend/src/components/touch.css`
- [ ] T049 Add NixOS module skeleton in `nix/module.nix`
- [ ] T050 Add module options and validation in `nix/module.nix`
- [ ] T051 Add systemd service wiring in `nix/module.nix`
- [ ] T052 Add documentation for safety rules in `docs/safety.md`
- [ ] T053 Add documentation for list-file identity rules in `docs/list-file.md`
- [ ] T054 Add documentation for testing expectations in `docs/testing.md`
- [ ] T055 Add phase testing guide for Phase 0-2 in `docs/testing/phase-guides.md`
---
## Dependencies & Execution Order
### Phase Dependencies
- **Setup (Phase 1)**: No dependencies - can start immediately
- **Foundational (Phase 2)**: Depends on Setup completion - BLOCKS all user stories
- **Safety & Compliance (Phase 2.5)**: Depends on Foundational completion - BLOCKS all destructive workflows
- **User Stories (Phase 3+)**: Depend on Safety & Compliance completion
- User stories can then proceed in parallel (if staffed)
- Or sequentially in priority order (P1 → P2 → P3)
- **Polish (Final Phase)**: Depends on all desired user stories being complete
### User Story Dependencies
- **User Story 1 (P1)**: Can start after Safety & Compliance (Phase 2.5)
- **User Story 2 (P2)**: Can start after Safety & Compliance (Phase 2.5)
- **User Story 3 (P3)**: Can start after Safety & Compliance (Phase 2.5)
### Within Each User Story
- Services before API endpoints
- Endpoints before UI wiring
- UI components before page integration
- Story complete before moving to next priority
### Parallel Opportunities
- All Setup tasks marked [P] can run in parallel
- All Safety & Compliance tasks marked [P] can run in parallel (within Phase 2.5)
- Within a story, tasks marked [P] can run in parallel across different files
---
## Parallel Example: User Story 1
```bash
# Launch service and UI tasks for User Story 1 together:
Task: "Implement collage sampler in backend/src/services/collage_sampler.rs"
Task: "Create collage UI page in frontend/src/pages/untagged-collage.svelte"
```
---
## Implementation Strategy
### MVP First (User Story 1 Only)
1. Complete Phase 1: Setup
2. Complete Phase 2: Foundational
3. Complete Phase 2.5: Safety & Compliance
4. Complete Phase 3: User Story 1
5. **STOP and VALIDATE**: Validate untagged review workflow end-to-end
### Incremental Delivery
1. Complete Setup + Foundational + Safety → Foundation ready
2. Add User Story 1 → Validate independently (MVP)
3. Add User Story 2 → Validate independently
4. Add User Story 3 → Validate independently
5. Add Polish tasks as needed
### Parallel Team Strategy
With multiple developers:
1. Team completes Setup + Foundational together
2. Once Safety & Compliance is done:
- Developer A: User Story 1
- Developer B: User Story 2
- Developer C: User Story 3
3. Stories complete and integrate independently