Dream Canvas

Image and video reconstruction based on the idea of “scrapbooking”.


2024
For: Harvard GSD 6365 Enactive Design
Role: Software Developer






Research Question


Scrapbooking is the method of preserving, presenting, and arranging personal and family history in the form of a book, box, or card.

It is incredibly hands-on, creative, and personal.

How can modern image generation tools borrow from the concept of “scrapbooking”?





Initial Tests

Simple stack of images

Input Prompt: “a cat on a rooftop with a train on the right of the cat”

Enriched Prompt:
"A curious cat sits on a rooftop, its tail curled, gazing towards a train passing by on the right side. The scene is set under a slightly cloudy sky, with the train’s metal frame glinting in the sunlight. The rooftop has old, weathered tiles, and the surroundings include distant buildings and power lines, adding an urban atmosphere."

Image Preprocessing for Img2Img
The goal is to find the best way to dissolve the sharp boundaries of the input images, so that these edges don’t dominate the final composition.


Method 1:
Canny Edge







The input image’s border persists.

Method 2:
Segmentation





Often results in sharp edges and boxiness.

Method 3:
Saliency with OpenCV





Saliency is able to blur out the border and harsh linear edges.




Positional Coordinates from Canvas:
To make the user experience intuitive, the position where each image is finally placed on the canvas should directly influence the prompt. This is done by converting the x,y coordinates of each image’s center point into natural language.
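The coordinate-to-language step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names `describe_position` and `build_prompt`, the 3x3 grid, and the top-left canvas origin are all assumptions made for the example.

```python
def describe_position(cx: float, cy: float, width: float, height: float) -> str:
    """Map an image's center point to a short spatial phrase.

    Assumes (hypothetically) a canvas with its origin at the top left,
    split into a 3x3 grid of equal cells.
    """
    if width <= 0 or height <= 0:
        raise ValueError("canvas dimensions must be positive")

    # Normalize to [0, 1] and clamp points that fall outside the canvas.
    nx = min(max(cx / width, 0.0), 1.0)
    ny = min(max(cy / height, 0.0), 1.0)

    col = "left" if nx < 1 / 3 else "right" if nx > 2 / 3 else "center"
    row = "top" if ny < 1 / 3 else "bottom" if ny > 2 / 3 else "middle"

    if row == "middle" and col == "center":
        return "in the center"
    if row == "middle":
        return f"on the {col}"
    if col == "center":
        return f"at the {row}"
    return f"in the {row} {col}"


def build_prompt(items, width: float, height: float) -> str:
    """Join (label, cx, cy) tuples into one positional prompt string."""
    return ", ".join(
        f"{label} {describe_position(cx, cy, width, height)}"
        for label, cx, cy in items
    )


if __name__ == "__main__":
    placed = [("a cat", 500, 300), ("a train", 900, 300)]
    print(build_prompt(placed, width=1000, height=600))
```

A coarse grid keeps the phrases short and close to how people describe layouts; a finer grid or relative phrases ("to the right of the cat") would be a natural extension.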




Various Tests and Insights:

White backgrounds often have a strong influence on the final Stable Diffusion output.



When the saliency map is overly segmented, it can result in floating objects in the final composition.

Using a black background and applying the saliency map as a mask, rather than cropping with it, resulted in a better output.
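As an illustration of masking rather than cropping, the sketch below fades non-salient pixels to black while keeping the image at its original size. It assumes a saliency map is already available as a 2D array (for example from an OpenCV saliency detector); the function name `mask_with_saliency` and the choice of a soft, multiplicative mask are assumptions for the example, not the project's confirmed implementation.

```python
import numpy as np


def mask_with_saliency(image: np.ndarray, saliency: np.ndarray) -> np.ndarray:
    """Fade non-salient pixels to black without changing the image size.

    image:    H x W x 3 uint8 array.
    saliency: H x W array of any numeric range; higher means more salient.
    """
    if image.shape[:2] != saliency.shape:
        raise ValueError("saliency map must match the image height and width")

    sal = saliency.astype(np.float32)
    span = float(sal.max() - sal.min())
    if span > 0:
        # Rescale to [0, 1] so the map can act as a per-pixel weight.
        sal = (sal - sal.min()) / span
    else:
        # A flat map carries no information; keep the whole image.
        sal = np.ones_like(sal)

    # Multiplying by the weight leaves salient regions intact and lets the
    # rest fall off toward a black background instead of a hard crop edge.
    weighted = image.astype(np.float32) * sal[..., np.newaxis]
    return np.clip(np.rint(weighted), 0, 255).astype(np.uint8)


if __name__ == "__main__":
    img = np.full((4, 4, 3), 200, dtype=np.uint8)
    sal_map = np.zeros((4, 4), dtype=np.float32)
    sal_map[1:3, 1:3] = 1.0  # pretend the center is the salient subject
    out = mask_with_saliency(img, sal_map)
    print(out.shape, out[0, 0], out[1, 1])
```

Because the output keeps the input's dimensions, the object stays where it was placed on the canvas, which a tight crop would not guarantee.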






Pipeline




Demos from Users

The style produced from each set of images is surprisingly similar and remains remarkably consistent across multiple generations.







Kevin Tang © 2019 - 2024. All Rights Reserved