Project Wonderland
A speculative AR experience project that allows users to reconstruct any real-world object into any form using AIGC technology. Technically, with the support of AI algorithms and high-performance servers, we can achieve real-time digital model remeshing. Conceptually, the project explores how technology expands human capabilities in the context of posthumanism through the fusion of humans and algorithms, while also addressing the hidden issue of algorithmic power.

Project Wonderland is built on the foundation of Text2Mesh. After obtaining a digital model of a physical object through the headset's camera scan (currently substituted with Scaniverse), we use Text2Mesh's neural network to stylize the geometry and color details of the specified 3D model from a text input (demonstrated using speech-to-text). Then, using the Meta Quest 3 or another headset with spatial scanning and localization capabilities, the system overlays the stylized model onto the original physical object in the AR environment, effectively achieving a reconstruction of reality.
Although this project proposes a real-time interactive workflow, due to the current limitations of graphics processing power, the remeshing process can only be compressed to approximately 10 minutes at best. Nevertheless, the project speculatively envisions a future direction for the integration of AR and AI—two of the most prominent emerging technologies—in the next year or two.
Conceptually, from a positive perspective, this project expands human perception and creativity. With the help of AR and AI, we are able to "transform" what we see, and this ability stimulates imagination and subjective agency. However, from a critical perspective, the project also reveals the implicit power of media and algorithms: it reconstructs and guides how users perceive reality; while enabling user creativity, the algorithm also subtly limits and shapes the outcomes.
Text2Mesh:
Text2Mesh produces color and geometric details over a variety of source meshes, driven by a target text prompt. Its stylization results coherently blend unique and ostensibly unrelated combinations of text, capturing both global semantics and part-aware attributes. After setting up the local runtime environment, simply use the command template below, replacing the mesh path and text prompt with your own.
python main.py --run branch \
  --obj_path data/source_meshes/studio_sofa.obj \
  --output_dir results/demo/sofa/sofa1 \
  --prompt "a 3d model of mushroom sofa" \
  --sigma 5.0 --clamp tanh --n_normaugs 4 \
  --n_augs 1 --normmincrop 0.1 --normmaxcrop 0.1 \
  --geoloss --colordepth 2 --normdepth 2 \
  --frontview --frontview_std 4 --clipavg view \
  --lr_decay 0.9 --clamp tanh --normclamp tanh \
  --maxcrop 1.0 --save_render --seed 131 \
  --n_iter 1500 --learning_rate 0.0005 \
  --normal_learning_rate 0.0005 --background 1 1 1 \
  --frontview_center 1.96349 0.6283
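In Project Wonderland the prompt is not typed by hand but comes from speech-to-text, so the same command has to be assembled programmatically. The sketch below is one minimal way to do that, assuming a local text2mesh checkout and reusing the flags from the template above; the run_text2mesh helper name and the example paths are illustrative, not part of the Text2Mesh codebase.

import subprocess

def run_text2mesh(obj_path: str, prompt: str, output_dir: str) -> None:
    """Launch a Text2Mesh stylization job with the same flags as the template above."""
    cmd = [
        "python", "main.py", "--run", "branch",
        "--obj_path", obj_path,
        "--output_dir", output_dir,
        "--prompt", prompt,
        "--sigma", "5.0", "--clamp", "tanh", "--n_normaugs", "4",
        "--n_augs", "1", "--normmincrop", "0.1", "--normmaxcrop", "0.1",
        "--geoloss", "--colordepth", "2", "--normdepth", "2",
        "--frontview", "--frontview_std", "4", "--clipavg", "view",
        "--lr_decay", "0.9", "--normclamp", "tanh",
        "--maxcrop", "1.0", "--save_render", "--seed", "131",
        "--n_iter", "1500", "--learning_rate", "0.0005",
        "--normal_learning_rate", "0.0005", "--background", "1", "1", "1",
        "--frontview_center", "1.96349", "0.6283",
    ]
    # Blocks until the stylized mesh and renders are written to output_dir
    subprocess.run(cmd, check=True)

# Example: the prompt string would come from the headset's speech-to-text result
run_text2mesh("data/source_meshes/studio_sofa.obj",
              "a 3d model of mushroom sofa",
              "results/demo/sofa/sofa1")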
Spatial Anchor:
Meta Quest 3’s accurate depth projection and room mapping capabilities allow for flexible use of its spatial anchor and room scanning features. Spatial anchors provide positioning and anchoring for virtual objects in the real world within an AR environment. This project utilizes spatial anchors to position the remeshed models, ensuring the effect of “overlaying” the original objects. The data obtained from room scanning—such as floor, walls, surfaces, and obstacles—helps establish a more realistic model placement logic, enabling an AR experience that adheres to the logic of the physical world.
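The overlay itself reduces to a single rigid transform: the stylized mesh is expressed in the coordinate frame of the anchor recorded for the physical object. The placement happens inside Unity, but the underlying math can be sketched independently; the snippet below, using numpy and illustrative variable names, shows how an anchor's position and orientation quaternion would be turned into a 4x4 matrix and applied to the remeshed model's vertices. It is a conceptual sketch, not the headset SDK's API.

import numpy as np

def anchor_to_matrix(position, quaternion):
    """Build a 4x4 rigid transform from an anchor's position (x, y, z)
    and unit orientation quaternion (x, y, z, w), as reported by the headset."""
    x, y, z, w = quaternion
    # Standard quaternion-to-rotation-matrix conversion
    R = np.array([
        [1 - 2*(y*y + z*z),     2*(x*y - z*w),     2*(x*z + y*w)],
        [    2*(x*y + z*w), 1 - 2*(x*x + z*z),     2*(y*z - x*w)],
        [    2*(x*z - y*w),     2*(y*z + x*w), 1 - 2*(x*x + y*y)],
    ])
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = position
    return T

def place_on_anchor(vertices, anchor_pose):
    """Transform model-space vertices (N x 3) into world space at the anchor."""
    T = anchor_to_matrix(anchor_pose["position"], anchor_pose["rotation"])
    homogeneous = np.hstack([vertices, np.ones((len(vertices), 1))])
    return (homogeneous @ T.T)[:, :3]

# Example: overlay the remeshed sofa at the anchor recorded for the physical sofa
sofa_anchor = {"position": (0.4, 0.0, -1.2), "rotation": (0.0, 0.707, 0.0, 0.707)}
verts = np.zeros((3, 3))  # stand-in for the stylized mesh's vertex array
print(place_on_anchor(verts, sofa_anchor))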
Integration:
This project is developed using the Unity engine. After integrating the model, anchor points, and room data into the project backend, the program reads the user’s spatial position and monitors whether the Trigger button is pressed to determine whether to initiate the remesh process. When the user approaches a pre-set and pre-calculated spatial anchor of a real-world object and presses the Trigger button, they can input a command via voice. The program will then run Text2Mesh (for demonstration purposes, due to previously mentioned time constraints, a pre-remeshed model is currently used). Ultimately, the user will see the object reconstructed in its new form.
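The decision flow just described (stand near an anchored object, press the Trigger, speak a prompt, then swap in the stylized model) is implemented in Unity; the Python sketch below only mirrors that control flow so it can be read alongside the description. Every name in it, including listen_for_prompt, start_remesh_job, and the 1.5 m radius, is an illustrative stand-in rather than an actual API or tuned value.

import math

TRIGGER_RADIUS = 1.5  # meters; how close the user must be to an anchor (illustrative value)

def listen_for_prompt() -> str:
    # Placeholder for the headset's speech-to-text result
    return "a 3d model of mushroom sofa"

def start_remesh_job(object_id: str, prompt: str) -> str:
    # Placeholder for launching Text2Mesh (see the command template above)
    return f"results/{object_id}/{prompt.replace(' ', '_')}.obj"

def on_trigger_pressed(user_position, anchors, precomputed_models):
    """Decide what to show when the Trigger button is pressed.

    anchors: {object_id: (x, y, z)} spatial-anchor positions of real objects
    precomputed_models: {object_id: path} pre-remeshed assets used for the demo
    """
    # Nearest pre-set anchor to the user's current position
    object_id, anchor_pos = min(anchors.items(),
                                key=lambda kv: math.dist(user_position, kv[1]))
    if math.dist(user_position, anchor_pos) > TRIGGER_RADIUS:
        return None  # not standing near any anchored object; ignore the press
    prompt = listen_for_prompt()
    if object_id in precomputed_models:
        return precomputed_models[object_id]      # demo: swap in the pre-remeshed model
    return start_remesh_job(object_id, prompt)    # otherwise run the ~10 minute remesh

# Example call with a single anchored sofa
print(on_trigger_pressed((0.3, 0.0, -1.0),
                         {"sofa": (0.4, 0.0, -1.2)},
                         {"sofa": "results/demo/sofa/sofa1"}))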
Next Step:
- Although this is a speculative project, the experience proposed above has already been realized at the software level. What we need now is only more powerful hardware, which is often just one to two years away.
- The concepts of reality reconstruction and remeshing can be extended to many other scenarios. Most existing AI-driven 3D model generation tools on the market are still limited to text-to-model or image-to-model workflows. In contrast, this project proposes a model-to-model paradigm. This approach can be easily integrated into various 3D digital production or virtual production workflows, such as filmmaking or AR applications.