Function2Scene: 3D Indoor Scene Layout from Functional Specifications - Summary

Summary (Overview)

  • Functionality-First Framing: Proposes a new paradigm for 3D indoor scene synthesis, shifting the input from object-centric prompts (e.g., "a bedroom with a queen bed") to detailed functional specifications (e.g., "a bedroom for a couple where one partner reads late while the other sleeps early").
  • Constraint Taxonomy: Introduces a comprehensive, LLM-customizable taxonomy of 17 functional design constraints organized into four categories: Spatial (S), Ergonomic (E), Activity (A), and Environmental (N), grounded in interior design literature.
  • Iterative Check-and-Repair Pipeline: Develops a novel framework that iteratively evaluates and refines generated layouts using a tool-augmented loop combining geometric measurements, LLM-based reasoning, and VLM-based visual assessment.
  • Superior Functional Performance: Demonstrates that the proposed method significantly outperforms recent LLM-based scene synthesis baselines, with its layouts preferred in 94.3% of pairwise comparisons in a perceptual study based on real-world interior design cases.

Introduction and Theoretical Foundation

A furnished room is not merely a collection of objects but a design that supports human activities. Traditional text-driven 3D scene synthesis methods focus on generating visually plausible object arrangements from prompts that specify what furniture to place. In contrast, real interior design starts from a functional specification—a natural-language design brief describing who will use the space and what they need to do there. This work reframes the problem as generating layouts that satisfy these functional requirements.

The paper identifies a gap: although LLMs enable flexible text-conditioned synthesis, existing LLM-based methods inherit an "implicit" approach. They generate scenes from object-centric prompts and check visual and physical plausibility, not functional support. Ironically, LLMs are well suited to the two tasks that limited classical rule-based systems: parsing open-ended functional descriptions and optimizing heterogeneous functional criteria. This work uses LLMs to bridge the gap, combining their reasoning ability with explicit, customizable design principles.

Methodology

The Function2Scene framework operates in two main stages: Initialization and Constraints-based Evaluation and Refinement.

1. Initialization

Given a raw functional description:

  1. Parsing: An LLM parses the input to extract:
    • A parsed scene description (an LLM-friendly reformulation).
    • A structured list of functional constraints customized from the taxonomy based on extracted occupant personas and activities.
  2. Room Structure Generation: An LLM generates an empty room structure (walls, doors, windows) encoded in a custom JSON-based Domain-Specific Language (DSL). This structure is visualized for user verification.
  3. Furniture Initialization: An LLM generates an initial furniture layout within the verified room. This serves as a starting point but is often functionally deficient, motivating the refinement stage.
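
The summary does not reproduce the paper's DSL, so the following is only a sketch of what a JSON-encoded room structure and a basic sanity check on it might look like. Every field name here (room, openings, kind, wall, offset, width) and the validate_room helper are hypothetical and chosen for illustration; the actual schema may differ.

```python
import json
from typing import List

# Hypothetical room description; field names and units are assumptions,
# not the schema used by Function2Scene.
ROOM_JSON = """
{
  "room": {"type": "bedroom", "width": 144, "depth": 120, "height": 96, "units": "in"},
  "openings": [
    {"kind": "door",   "wall": "south", "offset": 12, "width": 32},
    {"kind": "window", "wall": "east",  "offset": 30, "width": 48}
  ]
}
"""

WALLS = ("north", "south", "east", "west")


def wall_length(room: dict, wall: str) -> float:
    """North/south walls span the room width; east/west walls span its depth."""
    return room["width"] if wall in ("north", "south") else room["depth"]


def validate_room(spec: dict) -> List[str]:
    """Return human-readable problems; an empty list means the structure looks sane."""
    errors: List[str] = []
    room = spec["room"]
    for i, opening in enumerate(spec["openings"]):
        if opening["wall"] not in WALLS:
            errors.append(f"opening {i}: unknown wall {opening['wall']!r}")
            continue
        length = wall_length(room, opening["wall"])
        if opening["offset"] < 0 or opening["offset"] + opening["width"] > length:
            errors.append(
                f"opening {i} ({opening['kind']}) does not fit on the {opening['wall']} wall"
            )
    return errors


spec = json.loads(ROOM_JSON)
errors = validate_room(spec)
print(errors)
```

A check of this kind could run before the structure is shown to the user for verification, so that obviously malformed rooms are caught before furniture is placed.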

2. Constraints-based Evaluation and Refinement

This stage performs an iterative check-and-repair loop over the generated layout.

  • Constraint Taxonomy: Constraints are organized into four categories across six priority tiers (T1-T6). Constraints in lower-numbered tiers (e.g., basic spatial validity) must be satisfied before those in higher-numbered tiers (e.g., environmental comfort) are considered.
| Category | Constraints | What it checks |
| --- | --- | --- |
| Spatial (S) | S1: Geometry Validity; S2: Boundary & Attachment; S3: Spatial Relationships; S4: Scale & Proportion; S5: Visual Composition | Object containment, wall attachment, grouping logic, size proportionality, visual balance. |
| Ergonomic (E) | E1: Circulation; E2: Interaction Clearance; E3: Reachability; E4: Body Fit & Posture | Clear pathways, door/chair clearance, user reach, anthropometric fit. |
| Activity (A) | A1: Activity Zone; A2: Sightlines & Privacy; A3: Workflow Sequencing; A4: Multi-activity Compatibility | Dedicated zones for tasks, unobstructed views, logical activity order, simultaneous activity support. |
| Environmental (N) | N1: Natural Light Access; N2: Glare Prevention; N3: Acoustic Separation; N4: Ventilation & Thermal | Daylight access, screen glare avoidance, noise buffering, vent/heat source clearance. |
  • Evaluation: For each constraint in priority order, specialized tools are invoked to retrieve data, which an LLM interprets to determine satisfaction.
    • Geometric/Numeric Tools: e.g., boundary_check(), pathfinding(), free_floor_area().
    • LLM Query Tools: e.g., size_check(), reach_check(), activity_support_check().
    • VLM Tools: e.g., visual_balance_check().
  • Refinement: For each unsatisfied constraint, the LLM proposes a targeted refinement action (reposition, reorient, resize) grounded in design principles (e.g., minimum 36" circulation path, 2-3' side clearance for a bed).
  • Termination: The loop proceeds through all constraints in priority order, skipping any refinement that would violate a higher-priority constraint that is already satisfied. Tier 1 constraints are re-evaluated at the end to ensure foundational quality.
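
The evaluation, refinement, and termination steps above can be sketched as a small loop. This is a minimal illustration, not the authors' implementation: plain Python functions stand in for the tool calls and the LLM's judgments and proposals, the room size and the single bed are made up, and the tier numbers assigned to the three example constraints are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Rect = Tuple[float, float, float, float]  # x, y, width, depth (inches)
Layout = Dict[str, Rect]

ROOM_W, ROOM_D = 144.0, 120.0  # hypothetical room footprint in inches


@dataclass
class Constraint:
    name: str
    tier: int                           # 1 = most foundational
    check: Callable[[Layout], bool]     # stands in for tool output + LLM verdict
    repair: Callable[[Layout], Layout]  # stands in for an LLM-proposed action


def move_bed(layout: Layout, x: float) -> Layout:
    _, y, w, d = layout["bed"]
    return {**layout, "bed": (x, y, w, d)}


def in_room(layout: Layout) -> bool:
    return all(0 <= x and x + w <= ROOM_W and 0 <= y and y + d <= ROOM_D
               for x, y, w, d in layout.values())


def clamp_bed(layout: Layout) -> Layout:
    x, _, w, _ = layout["bed"]
    return move_bed(layout, min(max(x, 0.0), ROOM_W - w))


def side_clearance(layout: Layout, gap: float = 24.0) -> bool:
    # 24 in = 2 ft, the lower end of the bed side clearance quoted above.
    x, _, w, _ = layout["bed"]
    return x >= gap and ROOM_W - (x + w) >= gap


def check_and_repair(layout: Layout, constraints: List[Constraint]):
    report: Dict[str, str] = {}
    satisfied: List[Constraint] = []
    for c in sorted(constraints, key=lambda c: c.tier):
        if c.check(layout):
            report[c.name] = "ok"
            satisfied.append(c)
            continue
        candidate = c.repair(layout)
        # Skip a repair that would break a higher-priority satisfied constraint.
        if not all(s.check(candidate) for s in satisfied):
            report[c.name] = "skipped"
        elif c.check(candidate):
            layout = candidate
            report[c.name] = "repaired"
            satisfied.append(c)
        else:
            report[c.name] = "unsatisfied"
    # Final re-evaluation of Tier 1 constraints.
    tier1_ok = all(c.check(layout) for c in constraints if c.tier == 1)
    return layout, report, tier1_ok


constraints = [
    Constraint("S1_in_room", 1, in_room, clamp_bed),
    Constraint("E2_side_clearance", 2, side_clearance,
               lambda l: move_bed(l, ROOM_W - l["bed"][2] - 24.0)),
    Constraint("N1_near_window", 5, lambda l: l["bed"][0] >= 100.0,
               lambda l: move_bed(l, 100.0)),
]
initial = {"bed": (100.0, 10.0, 60.0, 80.0)}  # starts partly outside the room
final_layout, report, tier1_ok = check_and_repair(initial, constraints)
print(final_layout, report, tier1_ok)
```

In this toy run the first two constraints are repaired in turn, while the window-proximity repair is skipped because moving the bed to satisfy it would push the bed back through the wall and undo the Tier 1 fix.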

Empirical Validation / Results

  • Data: 30 professionally written interior design cases curated from Architectural Digest, spanning 10 room types and 30 unique personas (e.g., retired couple, child with autism, YouTuber).
  • Baselines: Compared against three representative LLM-based methods: Holodeck [Yang et al. 2024c], iDesign [Çelen et al. 2024], and LayoutVLM [Sun et al. 2025a].
  • Perceptual Study: A two-alternative forced-choice (2AFC) study with 30 participants evaluated which layout better satisfied the functional brief.

Table 2: 2AFC Study Results Comparing Our Method with Baselines

| Method | Prompt | % Ours Preferred |
| --- | --- | --- |
| Holodeck [Yang et al. 2024c] | Functional | 92.2 |
| | Parsed | 88.9 |
| iDesign [Çelen et al. 2024] | Functional | 94.4 |
| | Parsed | 98.9 |
| LayoutVLM [Sun et al. 2025a] | Functional | 96.7 |
| | Parsed | 94.4 |
| Overall | | 94.3 |
  • Ablation Study: Investigated the contribution of pipeline components.

Table 3: 2AFC Study Results Comparing Against Ablations

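| Prompt Format | Iterative Update | Evaluation Tools | % Ours Preferred |
| --- | --- | --- | --- |
| Functional | No | No | 83.3 |
| Parsed | No | No | 83.3 |
| Functional | Yes | No | 78.9 |
| Parsed | Yes | No | 80.0 |
| Parsed | Yes | Yes | Ours |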

Key findings:

  1. Function2Scene is strongly preferred over all baselines under both original functional and parsed prompts.
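  2. Iterative refinement without evaluation tools does not close the gap: the full method is still preferred 78.9-80.0% of the time against that ablation, compared with 83.3% against the ablations with no iteration.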
  3. The full pipeline (parsed input + iterative update + evaluation tools) is critical for superior performance.
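
As a quick arithmetic check on the reported preference rates, the snippet below averages the six per-condition percentages from Table 2. It assumes the six conditions are weighted equally, which this summary does not state; the dictionary keys are just labels for the table rows.

```python
# Percentages copied from Table 2 (share of comparisons in which
# Function2Scene was preferred over the named baseline).
table2 = {
    ("Holodeck", "Functional"): 92.2,
    ("Holodeck", "Parsed"): 88.9,
    ("iDesign", "Functional"): 94.4,
    ("iDesign", "Parsed"): 98.9,
    ("LayoutVLM", "Functional"): 96.7,
    ("LayoutVLM", "Parsed"): 94.4,
}

# Unweighted mean over all six conditions (equal weighting is an assumption).
overall = sum(table2.values()) / len(table2)

# Per-baseline mean over the two prompt formats.
by_method = {}
for (method, _prompt), pct in table2.items():
    by_method.setdefault(method, []).append(pct)
by_method = {m: sum(v) / len(v) for m, v in by_method.items()}

print(f"overall: {overall:.2f}")
for method, mean in by_method.items():
    print(f"{method}: {mean:.2f}")
```

The unweighted mean comes out at about 94.25, in line with the 94.3% overall figure reported in the table.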

Theoretical and Practical Implications

  • Theoretical: The work demonstrates that explicit, customizable functional constraints combined with LLM-driven iterative refinement can overcome the limitations of purely implicit, data-driven or direct LLM generation approaches. It successfully revisits classical design rule formalization in the era of foundation models.
  • Practical: The framework aligns 3D scene synthesis closer to real interior design workflows, where the goal is to support human activities. It provides a blueprint for developing more human-centered AI design assistants. The constraint taxonomy and tool-augmented evaluation loop offer a reusable structure for incorporating functional reasoning into other spatial AI tasks.

Conclusion

Function2Scene presents a framework for generating 3D indoor layouts from functional specifications. By focusing on functionality, employing a comprehensive and customizable constraint taxonomy, and implementing an LLM-driven iterative check-and-repair pipeline, it produces higher-quality, more functional scenes than prior LLM-based methods.

Future Directions:

  1. Upstream Conversational Interface: Developing a dialogue system to help non-expert users articulate and refine their needs into detailed functional specifications.
  2. Enhanced Verification Tools: Incorporating more powerful tools like embodied simulation with articulated models, physically accurate lighting/acoustic estimation, or a richer DSL for semantic spatial requirements.
  3. Co-optimization of Architecture: Extending the framework to jointly optimize room shape, openings, and partitions alongside furniture placement, capturing the full scope of interior design.
