Innovative AI Applications: Crafting 3D-Printable Creations
In this article, we delve into how artificial intelligence can transform your creative ideas into real-world 3D-printed objects. Through the use of Midjourney and open-source projects such as Shap-E, MVDream, and threestudio, we will explore the process of designing and fabricating various 3D meshes.
If you've been following my previous articles, you'll know that I enjoy experimenting with AI in creative fields. My explorations have included generating images, composing music, and engaging in creative writing. Recently, I ventured into the realm of three-dimensional design, utilizing both commercial and open-source AI tools to create physical objects that can be printed using a 3D printer. In this article, I will guide you through the steps I took to design and print four distinct 3D models, all of which can be found in the appendix's 3D Gallery.
Overview
I will walk you through four experiments involving different tools for 3D object generation and their outcomes. The first used commercial tools: Midjourney to create 2D images and 3dMaker.ai to extract a 3D mesh from one of them. The second used Shap-E, an open-source AI model from OpenAI. The third used MVDream, another open-source model, and the fourth combined MVDream with the open-source project threestudio.
For each experiment, I began with a text prompt, generated a 3D mesh, and then used Blender, an open-source desktop application, to refine and clean up the mesh. Desktop slicing applications like Ultimaker Cura and PrusaSlicer were employed to prepare and preview the meshes before printing them at local libraries. If you're interested in printing these objects, you can find them on my Thingiverse profile.
After discussing the four examples, I will touch on the societal implications and ethical considerations of generating 3D objects using AI, including ownership rights related to various services and systems. I will conclude with a summary of the insights I gained from my experiments.
Creating 3D Objects with Midjourney and 3dMaker.ai
For my first experiment, I utilized Midjourney, a commercial service that generates images from text prompts. To produce a 2D image, I logged into my Midjourney account and input the following prompt: “a simple geometric, 3D-printed sculpture, solid white plastic, on a small white plastic pedestal, gray background.” The service generated four thumbnail images.
All four images exhibited impressive quality, showcasing potential for 3D printing. I was particularly fond of the “Figure 8” image located in the top-left corner. Subsequently, I used Midjourney's upscale feature to create a larger version of this image.
Removing the Background with Clipdrop
The next phase involved removing the background from the 2D image to prepare for 3D mesh generation. While numerous AI models are capable of this task, I opted for a free service called Clipdrop from stability.ai, which performed admirably.
The sequence of images above illustrates the ease of using Clipdrop. I simply uploaded my source image and clicked the “Remove background” button. The service effectively eliminated the background while preserving the sculpture's details, allowing me to download the modified image.
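If you prefer a scripted, open-source route, background removal can also be done locally. Below is a minimal sketch using the rembg Python package; I did not use it for this project, and the file names are illustrative.

from rembg import remove
from PIL import Image

# Load the Midjourney image and strip its background.
source = Image.open("midjourney_sculpture.png")   # illustrative file name
cutout = remove(source)                           # returns an RGBA image with a transparent background
cutout.save("midjourney_sculpture_nobg.png")

Results vary by image and model, but the scripted route is handy for batch work.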
Creating a 3D Mesh from a 2D Image using 3dMaker.ai
To prepare for 3D printing, I needed to convert the 2D image from Midjourney into a 3D mesh. While several open-source models can accomplish this, I found a commercial service, 3dMaker.ai, to be particularly effective for a fee.
To generate the 3D mesh, I visited 3dMaker.ai, created an account, and uploaded the modified 2D image. The service offers two quality options: standard quality for $25 and high quality for $40. Standard quality is intended for hard surfaces and organic, detailed designs, while high quality is aimed at organic and very detailed inputs. I opted for standard quality and clicked the “Generate” button. The results took approximately 30 minutes to complete, and I downloaded the model in OBJ format, a common format for 3D geometry. The output was impressive, capturing the original image's details and accurately constructing the back of the sculpture.
Remarkably, 3D Maker AI generated the details of the sculpture's back based solely on the front view image. Another example of output from this service will be provided later in the article.
3D Printing Services and Local Libraries
The final step in this experiment was to print the 3D mesh. While I do not own a 3D printer, I discovered that numerous public libraries in the Boston area offer free or low-cost 3D printing services. Residents can utilize these printers to create practical items such as cellphone cases, cookie cutters, and small decorative objects.
The libraries provide different access options. Some feature 3D printers in “makerspaces,” which are collaborative workspaces for creation, invention, and learning. Others offer an online printing service where residents can upload a 3D mesh for printing. The libraries vary in their services, such as maximum print sizes and available colors for printing materials.
Printing the 3D Object
To print the 3D mesh, I needed to prepare the file in the standard STL format; however, 3dMaker.ai does not export in that format. Therefore, I used Blender to open the OBJ file and export it as STL. Next, I utilized a slicer app called PrusaSlicer to import the STL file and preview the print.
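For anyone who wants to script the OBJ-to-STL conversion instead of clicking through Blender's UI, here is a minimal sketch using Blender's Python API. It assumes Blender 3.x operator names (the OBJ and STL operators changed in Blender 4.0), and the file paths are illustrative.

import bpy

# Clear the default startup objects (cube, camera, light).
bpy.ops.object.select_all(action='SELECT')
bpy.ops.object.delete()

# Import the OBJ file downloaded from 3dMaker.ai.
bpy.ops.import_scene.obj(filepath="sculpture.obj")

# Export everything in the scene as a single STL for the slicer.
bpy.ops.export_mesh.stl(filepath="sculpture.stl")

You can run this headless with blender --background --python convert.py.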
I printed the model using dark gray filament at the Waltham Public Library’s Makerspace. In 3D printing, filament is the thermoplastic feedstock for fused deposition modeling (FDM) printers; it is melted and extruded through the printer's nozzle to build objects layer by layer. My sample was printed on the library's Prusa MK3S printer with a 0.6mm nozzle.
Additionally, I printed a piece in white filament at the Woburn Public Library using their online form. Woburn allows prints up to 11" tall, but I limited mine to 6.5" due to print time constraints, as they do not accept jobs exceeding 10 hours. Unlike other libraries, Woburn charges for the material used, but they provide a monthly $5 credit. My final cost after applying the credit was $3.63. Here are the two printed pieces.
Both printed objects turned out excellently! The one from Waltham measures 4.75 inches tall and showcases many details from the original Midjourney image, although some stratification artifacts are visible due to the smaller scale. The free printing service in Waltham is limited to objects that take less than 8 hours to create. The Woburn print appears better at 6.5 inches tall, with fewer stratification artifacts, though a few dark marks are present on its surface.
Creating 3D Objects with Shap-E
In my next experiment, I employed the open-source AI model from OpenAI known as Shap-E, which generates 3D meshes from text prompts. The name Shap-E is a clever reference to OpenAI's DALL-E, which creates 2D images based on text prompts. Below are some example images from their research.
The Shap-E Model
In May 2023, OpenAI launched the Shap-E model for text-to-3D generation. This system is designed to create 3D meshes by generating parameters for implicit functions, enabling detailed 3D visuals. A two-step training process was employed, which first maps 3D assets to parameters, followed by a refinement phase using a conditional diffusion model. This approach allows for the efficient production of various 3D models from text prompts, with each sample taking around 13 seconds to generate on a GPU. OpenAI has made the source code and trained model weights available on GitHub under the MIT open-source license.
The system employs three AI models as described on OpenAI’s model card:
- transmitter: The encoder and corresponding projection layers that convert outputs into implicit neural representations.
- decoder: The final projection layer of the transmitter, smaller than the transmitter as it does not include parameters for encoding 3D assets. This is the minimal model required to convert diffusion outputs into implicit neural representations.
- text300M: The text-conditional latent diffusion model.
An image-conditional latent diffusion model also exists, but I did not utilize it for this project.
Running Shap-E in Python
I employed Google Colab to create a 3D mesh from a text prompt. Below is the Python code demonstrating how I initialized the system.
import torch

from shap_e.diffusion.gaussian_diffusion import diffusion_from_config
from shap_e.models.download import load_model, load_config

# Use the GPU if one is available.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# The transmitter encodes and decodes latents into implicit neural representations.
xm = load_model('transmitter', device=device)
# The text-conditional latent diffusion model and its sampling configuration.
model = load_model('text300M', device=device)
diffusion = diffusion_from_config(load_config('diffusion'))
This code downloads and loads the transmitter (which contains the decoder as its final projection layer) and the text-conditional diffusion model, along with the diffusion configuration, placing them on the GPU if one is available.
Using Shap-E to Create a 3D Object
I executed the following code to input a text prompt into Shap-E to generate the latent parameters for a 3D shape.
import random

import numpy as np
from IPython.display import display

from shap_e.diffusion.sample import sample_latents
from shap_e.util.notebooks import create_pan_cameras, decode_latent_images, gif_widget

prompt = "a dolphin"
seed = 0
batch_size = 1
guidance_scale = 15
render_mode = 'nerf'
size = 512

# Fix all random seeds so the same prompt always produces the same mesh.
torch.manual_seed(seed)
np.random.seed(seed)
random.seed(seed)

# Sample the latent parameters that define the 3D shape.
latents = sample_latents(
    batch_size=batch_size, model=model, diffusion=diffusion,
    guidance_scale=guidance_scale, model_kwargs=dict(texts=[prompt] * batch_size),
    progress=True, clip_denoised=True, use_fp16=True, use_karras=True,
    karras_steps=64, sigma_min=1e-3, sigma_max=160, s_churn=0
)

# Render each latent from a ring of cameras and show it as an animated GIF.
cameras = create_pan_cameras(size, device)
for i, latent in enumerate(latents):
    images = decode_latent_images(xm, latent, cameras, rendering_mode=render_mode)
    display(gif_widget(images))
This code demonstrates how I used the prompt “a dolphin” to sample latent parameters via Shap-E. By setting the random seed to 0, I ensured consistent rendering of the same mesh. Altering the seed would yield different variations. Below is the resulting image.
While the rendering appears basic and has considerable empty space, it distinctly resembles a dolphin. Below is the code for exporting the 3D mesh.
from shap_e.util.notebooks import decode_latent_mesh

for i, latent in enumerate(latents):
    t = decode_latent_mesh(xm, latent).tri_mesh()
    with open(f'example_mesh_{i}.ply', 'wb') as f:
        t.write_ply(f)
This code utilizes the latent parameters to create a mesh in the PLY file format, another common format for 3D objects. Below is the appearance of the mesh in Blender.
The mesh still resembles a dolphin; however, it has several issues. The object's body displays layered steps, and the fins appear fragmented into horizontal slices. Additionally, the mesh lacks a pedestal for display as a printed object. To resolve the layering issue, I applied a Displacement modifier in Blender, followed by a Smooth modifier to enhance the shapes. To create a pedestal, I reran Shap-E with the prompt, “a cylindrical pedestal in the style of an ocean wave,” and exported the mesh. I then positioned and combined both objects in Blender. Below is the modified mesh.
The sides of the dolphin have been smoothed, and the fins have been fixed. Shap-E effectively rendered the pedestal, requiring minimal adjustments in Blender. I only needed to add a thin cylinder to the base of the pedestal to ensure stability on a surface. The model was then saved as an STL file.
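As an aside, similar cleanup could be scripted with the trimesh library instead of Blender. The sketch below is only a rough illustration of that alternative route, not what I did for the piece shown here; the file names, smoothing strength, and pedestal placement are illustrative and would need adjustment.

import trimesh
from trimesh.smoothing import filter_laplacian

# Load the two meshes exported from Shap-E.
dolphin = trimesh.load("example_mesh_0.ply")     # the dolphin
pedestal = trimesh.load("example_mesh_1.ply")    # the wave-styled pedestal

# Soften the stepped layering on the dolphin's body (modifies the mesh in place).
filter_laplacian(dolphin, iterations=10)

# Move the pedestal so its top sits at z = 0, then merge the two meshes.
pedestal.apply_translation([0.0, 0.0, -pedestal.bounds[1][2]])
combined = trimesh.util.concatenate([dolphin, pedestal])

# Save the result in STL format for the slicer.
combined.export("dolphin_with_pedestal.stl")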
Printing the 3D Object
Due to the significant negative space beneath the dolphin, the slicer app introduced temporary support columns to facilitate the printing of upper layers. Below are images illustrating this process.
I also printed this model at the Waltham Public Library’s Makerspace on their Prusa MK3S printer. Here’s the final result.
This print turned out well, measuring 5 inches wide. However, it reveals stratified lines from the printing process, and the dorsal fin at the top lacks definition.
Creating 3D Objects with MVDream
For my next experiment, I explored MVDream, an open-source project that generates multiple 2D views of a 3D object from a text prompt, emphasizing “multiple views.” The authors provide a description of the model in their research paper.
The authors note that MVDream is intended to support 3D generation for the gaming and media industries, but they caution that it could be misused to produce violent or sexual content through third-party fine-tuning. Because it is built on the Stable Diffusion model, it may also inherit that model's biases and limitations, leading to undesired outcomes. The creators therefore advise careful examination of any images or models produced with their approach.
I executed MVDream in a Google Colab environment using the following Python code.
import numpy as np
from PIL import Image
from IPython.display import display

from mvdream.camera_utils import get_camera

# The model, sampler, unconditional embedding (uc), and the t2i() and set_seed()
# helpers are defined in earlier notebook cells, following MVDream's demo script;
# camera holds the multi-view poses built with get_camera().

prompt = """a 3d-printed Cubist-styled sculpture of a male bust,
in light-gray plastic, on a simple light-gray pedestal,
dark-gray background"""
num_views = 4
seed = 12
set_seed(seed)

# Generate four consistent views of the object described by the prompt.
img = t2i(model, prompt=prompt, uc=uc, sampler=sampler, step=100, scale=10,
          batch_size=num_views, ddim_eta=0.0, device=device, camera=camera,
          num_frames=num_views, image_size=256, seed=seed)

# Stitch the views side by side into one image and display it.
images = np.concatenate(img, 1)
pil_image = Image.fromarray(images, 'RGB')
display(pil_image)
The code above illustrates how I used MVDream to generate multiple 2D views of a 3D object based on my text prompt. I specified the desired attributes of the object (a Cubist-styled sculpture of a male bust) and configured the model to produce four distinct views. The system then processed these views to create a single image that combines them side-by-side.
The images above depict four angles of a 3D-printed Cubist-style male bust, each view consistent with the others, showcasing the model’s ability to interpret and visualize the text prompt accurately. The sculpture's texture suggests a granularity typical of 3D printing, and its placement on a pedestal makes it appear ready for display.
Creating a 3D Mesh from Four 2D Images using 3dMaker.ai
I again employed the 3D Maker AI service to generate a 3D mesh from the four images. Here are four views of the mesh rendered in Blender.
The output from 3D Maker AI closely resembles the original, but some differences are evident. The output from MVDream features a head with more pronounced angular planes, especially around the facial facets characteristic of the Cubist style referenced in the prompt. Conversely, the rendered mesh from 3D Maker AI appears smoother, with more gradual transitions between the facial features and the planes of the head. To achieve a more artistic effect, I utilized the polygon reduction setting in the Prusa slicer to accentuate the head's angularity.
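A similar low-poly effect can also be scripted outside the slicer. The sketch below uses Open3D's quadric decimation to reduce the face count; this is only an illustration of the idea, not what I did (I relied on PrusaSlicer's built-in setting), and the file names and target triangle count are assumptions.

import open3d as o3d

# Load the mesh produced by 3dMaker.ai.
mesh = o3d.io.read_triangle_mesh("cubist_bust.obj")

# Collapse edges until roughly 2,000 triangles remain, exaggerating the flat facets.
low_poly = mesh.simplify_quadric_decimation(target_number_of_triangles=2000)

# STL files store per-facet normals, so compute them before writing.
low_poly.compute_triangle_normals()
o3d.io.write_triangle_mesh("cubist_bust_low_poly.stl", low_poly)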
The difference is subtle, but the head now has more distinct angular triangles than the previous version, reflecting the Cubist influence.
Printing the 3D Model
I printed this mesh at the Watertown Free Public Library’s Hatch Makerspace. They use a different slicer, Ultimaker Cura, but the printer is again a Prusa i3 MK3. I used tree supports connected to the build plate to minimize the attachment points on the finished piece.
The tree supports look particularly unusual with this 3D piece. However, the software required a method to support the brow's overhang during construction. Removing the supports was straightforward due to the minimal attachment points. Below is the final piece.
This piece turned out well, exhibiting a Cubist style through its angular and faceted surfaces. Cast in a uniform blue, the sculpture emphasizes geometric form over detail, with sharp planes defining the facial contours.
Creating 3D Objects with threestudio and MVDream
While the MVDream model efficiently renders multiple views of a 3D object from a text prompt, it does not generate an actual 3D mesh. This is where threestudio comes into play. Threestudio is an open-source project providing a modular framework that allows users to experiment with various text-to-3D and image-to-3D components, including MVDream.
The authors describe this framework as a unified and modular solution designed for 3D content generation. It extends diffusion-based 2D image generation models to include guidance for 3D generation while incorporating conditions such as text and images. They detail the modular architecture and design of each component within threestudio, also re-implementing state-of-the-art methods for 3D generation.
Threestudio’s pipeline for generating 3D content based on text or images consists of several essential components:
- Random camera generation: Produces the camera parameters used during optimization, including extrinsic and intrinsic properties as well as lighting conditions.
- Geometry: Defines the 3D object or scene using representations such as an Implicit Signed Distance Function (SDF) or an Implicit Density Field.
- Materials: Determine the object's appearance under various conditions, using Diffuse or Physically Based Rendering (PBR) types.
- Background: Can be a Neural Environment Map, a Textured Map, or a Solid Color.
- Rasterizers: Render the final image from the geometry and materials.
- Guidance: Diffusion models such as DeepFloyd-IF and Stable Diffusion use text or image inputs to steer the optimization toward the desired 3D content.
Using threestudio and MVDream to Create a 3D Object
I ran threestudio with MVDream in Google Colab; however, the model requires a GPU with more than 16 GB of VRAM, so I needed a Colab Pro subscription to access an A100 GPU.
After installing threestudio and the MVDream extension, I employed the following code to create a 3D object based on a text prompt.
prompt = """a 3d-printed abstract sculpture with geometric shapes,
in light-gray plastic, on a simple pedestal"""!python launch.py --config custom/threestudio-mvdream/configs/mvdream-sd21.yaml
—train --gpu 0 system.prompt_processor.prompt="$prompt" seed=42
I defined the prompt and executed the launch.py script, indicating the use of the MVDream configuration file. The operation referred to as “training” involves running an optimization loop to create a checkpoint file that outlines the 3D geometry based on the text prompt. I set the seed to 42 to ensure consistent output; modifying the seed number would generate different variations. The script took approximately 40 minutes to complete.
Creating a 3D Mesh from the Trained Checkpoint
During the training optimization, the system renders images showing the status of the shape being formed. Below is the resultant 3D object based on the prompt, “a 3d-printed abstract sculpture with geometric shapes, in light-gray plastic, on a simple pedestal.”
The threestudio system using MVDream produced a sculpture featuring three stacked triangular forms precariously balanced on a pedestal, with surfaces appearing weathered.
Printing the 3D Object
The next step involved exporting a 3D mesh that defines the shape. Here’s the command I executed for this purpose.
!python launch.py --config "$save_dir/../configs/parsed.yaml" --export --gpu 0 resume="$save_dir/../ckpts/last.ckpt" system.exporter_type=mesh-exporter system.geometry.isosurface_method=mc-cpu system.geometry.isosurface_resolution=256 system.exporter.save_texture=False system.exporter.fmt=obj
This command runs the export step on the previously generated checkpoint, specifying the mesh exporter. The mc-cpu isosurface method applies the marching cubes algorithm on the CPU to extract a surface mesh from the implicit geometry, and a resolution of 256 yields a reasonably detailed model. I requested only the OBJ file, skipping the texture map for efficiency. The export took 81 seconds to complete.
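For readers unfamiliar with marching cubes, the toy sketch below shows the idea: it samples a signed distance field for a sphere on a regular grid and extracts the zero level set as a triangle mesh. It uses scikit-image and trimesh, neither of which is part of the threestudio pipeline, and the output file name is illustrative.

import numpy as np
import trimesh
from skimage import measure

# Sample a signed distance field for a sphere of radius 0.8 on a 128^3 grid
# (negative inside the surface, positive outside).
res = 128
coords = np.linspace(-1.0, 1.0, res)
x, y, z = np.meshgrid(coords, coords, coords, indexing="ij")
sdf = np.sqrt(x**2 + y**2 + z**2) - 0.8

# Marching cubes walks the grid and triangulates the zero level set.
verts, faces, normals, _ = measure.marching_cubes(sdf, level=0.0)

# Wrap the result in a mesh object and save it in STL format.
trimesh.Trimesh(vertices=verts, faces=faces, vertex_normals=normals).export("sphere.stl")

Below is the actual exported mesh, rendered in Blender.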
The mesh appears similar to the rendering above but is bumpier and has some extra material at the bottom. I refined the mesh in Blender, removing most of the pedestal and adding a tapered cube as a replacement.
Printing the 3D Model
I printed this mesh at the Hatch Makerspace in Watertown using their Prusa i3 MK3. I employed standard supports attached to the lower part of the mesh.
The images above display the settings I used for the Cura Slicer, the printing process, and the finished piece prior to removing the supports. Below is a picture of the completed object.
This sculpture also turned out well, displaying a bumpy texture with some residual support remnants. Overall, it looks satisfactory. If I were to print it again, I would opt for tree supports that only connect to the build plate.
Having shared my journey of creating four 3D objects using AI, I will now address two crucial aspects: the societal implications of 3D object generation using AI and the ownership rights associated with the systems I utilized.
Societal Impact of 3D Object Generation Systems
The advent of 3D object generation technology represents a significant milestone, enabling individuals to convert their ideas into physical models. Through the use of commercial and open-source tools, this process adds a new dimension to creativity and manufacturing. However, it is essential to consider the societal implications accompanying these innovations, as they exemplify how technological advancements can yield both positive and negative effects.
Societal Impact of Shap-E
In their study, OpenAI examines biases present in their training dataset, which may influence the models' behaviors. They explored bias within their text-to-3D model by presenting ambiguous captions that left specific details, such as body shape or color, unspecified. The results revealed that some generated samples reflected common gender-role stereotypes in response to these ambiguous prompts. More details can be found in Appendix C of OpenAI’s research paper.
Societal Impact of MVDream
The creators of MVDream also discussed the societal ramifications of their model in their paper.
> "The multi-view diffusion model proposed in this paper aims to facilitate the 3D generation task that is widely demanded in the gaming and media industries. We do note that it could potentially be applied to unexpected scenarios such as generating violent and sexual content via third-party fine-tuning. Built upon the Stable Diffusion model, it might also inherit biases and limitations that lead to unwanted results. Therefore, we believe that the images or models synthesized using our approach should be carefully examined and presented as synthetic. Such generative models may also have the potential to displace creative workers through automation. However, these tools may also foster growth and improve accessibility in the creative industry." — Yichun Shi et al.
The need for a careful balance between leveraging AI's potential for innovation and addressing the ethical, cultural, and economic ramifications accompanying its rise is crucial in navigating the societal impacts of text-to-3D models.
Ownership Rights of 3D Object Generation Systems
Ownership rights are a critical aspect of media generation, necessitating careful examination of the terms of service for each tool to understand the rights granted to creators.
Ownership Rights of Midjourney Users
Midjourney has recently revised its policy regarding the ownership of images generated through its service. Previously, a paid subscription was required for users to own their created images, but this requirement has been relaxed for individual users. Below is the updated policy:
> "You own all Assets You create with the Services to the fullest extent possible under applicable law. There are some exceptions: > - Your ownership is subject to any obligations imposed by this Agreement and the rights of any third parties. > - If you are a company or any employee of a company with more than $1,000,000 USD a year in revenue, you must subscribe to a “Pro” or “Mega” plan to own Your Assets. > - If you upscale the images of others, those images remain owned by the original creators." — Midjourney
Thus, individual users retain ownership of their generated images. However, employees of companies earning over $1 million must pay $60 monthly to retain ownership of their images. Further pricing details can be found on their website.
Ownership Rights of 3dMaker.AI Users
The ownership rights associated with 3dMaker.AI are straightforward. The FAQ on their site states, “The models generated from 3dMaker.AI are 100% yours.” No legal interpretation is required!
Summary
In this project, I explored the process of transforming ideas into 3D-printable objects using both commercial and open-source AI tools. Beginning with generating 2D images through Midjourney and converting them into 3D models via 3dMaker.ai, I navigated the creative journey from digital conception to physical creation. Open-source tools such as Shap-E, MVDream, and threestudio expanded the possibilities further, enabling direct text-to-3D transformations.
This process involved refining generated models in Blender, preparing them for printing, and ultimately bringing them to life using 3D printers at local libraries. My journey showcased technological advancements in AI and 3D printing while highlighting the importance of considering societal impacts and understanding the ownership rights associated with these emerging tools.
As I navigated the creation and printing of 3D objects, this project underscored the confluence of innovation and accessibility in the field, prompting reflections on the ethical and practical implications of employing AI in creative processes.
Source Code
The source code for this project is available on GitHub, and the 3D designs have been posted on both Sketchfab and Thingiverse. I am releasing the code and designs under the Creative Commons Attribution-ShareAlike license.
Acknowledgments
I extend my gratitude to Jennifer Lim for reviewing this article and providing valuable feedback. My thanks also go to the staff at the Waltham Public Library Makerspace, the Watertown Free Public Library’s Hatch Makerspace, and the Woburn Public Library for their assistance in printing the objects featured in this article.
References
[1] Heewoo Jun and Alex Nichol, Shap-E: Generating Conditional 3D Implicit Functions (2023)
[2] Yichun Shi et al., MVDream: Multi-view Diffusion for 3D Generation (2023)
[3] Ying-Tian Liu et al., threestudio: a modular framework for diffusion-guided 3D generation (2023)
[4] Rombach et al., High-Resolution Image Synthesis with Latent Diffusion Models (2022)
Appendix A: 3D Gallery
Here are the 3D objects I created for this article. Feel free to interact with them or download the STL files.