Machine Learning and Neural Networks

 

MIT 6.S192 - Lecture 8: "How Machine Learning Can Benefit Human Creators" by Rebecca Fiebrink




Rebecca Fiebrink, a researcher working at the intersection of music and AI, emphasizes the importance of human interaction and keeping humans in the loop when machine learning is used and developed for creative purposes. She discusses her tool, Wekinator, which lets creators apply machine learning to real-time music making. She demonstrates building several gesture-controlled instruments, including a simple drum machine and controllers for a sound synthesis algorithm called the Blotar. She highlights how machine learning benefits creators by letting them explore complex and nuanced sound palettes and by making data analysis easier for sensors and other real-time data. She also addresses the benefits of interactively manipulating training data and explains how machine learning enables us to communicate with computers in a more natural way, while adding productive surprises and challenges to the creative work process.

  • 00:00:00 In this section of the video, Rebecca Fiebrink, a researcher in the area of music and artificial intelligence (AI), discusses the importance of human interaction and keeping humans in the loop in the development and use of machine learning for creative purposes. She questions the assumption that using machine learning to autonomously generate human-like creative output is in itself a support for human creators. Fiebrink's research has expanded to other areas such as art practice and games, and she emphasizes the need to think about the theoretical and practical usefulness of machine learning for human creators.

  • 00:05:00 In this section, the speaker discusses the gap in the toolset available to creators wanting to work with data or machine learning in music and art. While some people were using C++ libraries or Python, there were hardly any tools that worked in real time or with media data, especially sound. The creators who could use these tools often had PhDs in computer science or electrical engineering, leaving room for more accessible tools for creators who wanted to work with data. Machine learning can be a great tool for creators who want to make sense of the diverse types of data that surround them, such as online repositories and sources like Google Images, biosensors, or social media data.

  • 00:10:00 In this section, Rebecca Fiebrink explains her work building Wekinator, a piece of software that makes machine learning usable for real-time music making. She highlights that building a new instrument that responds to gestures is different from working with off-the-shelf ground-truth training sets. To make things easier, Wekinator lets users demonstrate training examples in real time and then test the model to see where it makes mistakes; users can also modify the training examples on the spot. She then demonstrates building a very simple gesture-controlled drum machine with Wekinator, using a webcam that captures motion and down-samples the input to a 10-by-10 color grid, yielding 100 numbers from which gestures can be predicted.

  • 00:15:00 In this section, the speaker demonstrates how to use Wekinator with regression to create an instrument that controls a sound synthesis algorithm called the Blotar. The instrument lets a user navigate a large space of sounds, spanning many different presets, by changing nine control parameters. The speaker shows how machine learning can benefit professional composers by enabling them to explore complex and nuanced sound palettes.

  • 00:20:00 In this section, Rebecca Fiebrink demonstrates how she uses machine learning to control the Blotar with a game controller. She explains that manually finding good positions in the instrument's nine-dimensional parameter space would be challenging even for an expert programmer, but machine learning makes it easy to learn such complex mapping functions. She shows how, through iterative training, the instrument can be refined until it produces the desired result, then saved and used in performances or composing work. As the tool's creator and a researcher, Fiebrink also discusses the various ways creators have used machine learning to improve their work and what this teaches about supporting creative practice.

  • 00:25:00 In this section, the speaker discusses how machine learning can benefit creators and enable more people to work with data effectively, especially with sensors and real-time data analysis, using examples such as Anne Hege's composition made with Wekinator and Michelle Nagai's music instrument. She also highlights how machine learning makes building interactions easier and more creative, pointing to Wekinator's uses in art, puppet shows, technologies for people with disabilities, and interactive prototyping. She explains that building interactions in this way usually requires a different approach than conventional machine learning: the goal is a model that generates believable, useful behavior, and how the model behaves when it falls short of that purpose becomes the real challenge.

  • 00:30:00 In this section, Fiebrink explores the differences between building a machine learning model with the goal of making accurate predictions, versus building an interactive machine learning model with the goal of building something useful or fun. When building an interactive machine learning model, the data is thought of as an interface for communication between a creator and the computer, meaning that the data is chosen subjectively and is unlikely to be independent and identically distributed (iid), which is a common assumption in machine learning. This can lead to learning from very few strategically placed examples. Fiebrink demonstrates how a simple algorithm like k nearest neighbor, when used interactively, can still produce good decision boundaries with a small amount of data, allowing for hands-on experimentation and data curation.
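To make the k-nearest-neighbor point concrete, here is a minimal sketch, not from the lecture; the feature vectors and label names below are invented. It shows how a handful of strategically placed examples can already carve out useful class regions, much as Wekinator lets a user do interactively:

```python
# Hypothetical sketch: a few hand-placed examples define decision boundaries
# when k-NN is used interactively (in the spirit of Wekinator's workflow).
from sklearn.neighbors import KNeighborsClassifier

# A few strategically placed training examples (feature vector -> sound preset).
# These points and labels are illustrative, not from the lecture.
X_train = [[0.1, 0.2], [0.15, 0.25], [0.8, 0.9], [0.85, 0.8], [0.5, 0.1]]
y_train = ["soft_pad", "soft_pad", "bright_lead", "bright_lead", "drum_hit"]

model = KNeighborsClassifier(n_neighbors=1)
model.fit(X_train, y_train)

# New sensor readings are classified immediately; the user can audition the
# result, then add or remove examples and retrain in seconds.
print(model.predict([[0.2, 0.3], [0.7, 0.85]]))
```

Because retraining such a model is nearly instantaneous, the curate-test-curate loop described above stays fast enough to feel like an instrument rather than a batch process.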

  • 00:35:00 In this section, Rebecca Fiebrink discusses the benefits of interactively manipulating the training data in creative domains. She explains that allowing people to explore many alternative ideas is essential for creating something that satisfies the design requirements. Fiebrink found that tools such as Wekinator enable people to retrain models very quickly and see the results immediately, making it possible to support rapid prototyping very effectively. She also notes that it is challenging to capture human practices or actions in code, even for expert programmers, in domains such as painting or playing musical instruments.

  • 00:40:00 In this section, Rebecca Fiebrink explains how machine learning enables us to communicate with computers in a more natural way, as it allows people to communicate their ideas in terms of examples, which is similar to how we communicate when talking about creative activities with each other. Machine learning also makes it easier for novices to create by leveraging big data sets to conform to a standard. However, Fiebrink's recent project, called Sound Control, shows the possibility of allowing more people to personalize interfaces and make things for themselves and others with machine learning. In collaboration with music teachers and therapists, Sound Control enables them to make bespoke instruments for kids, but it also led them to do other unexpected and useful things, such as making listening games, improvisation games, and performance activities.

  • 00:45:00 In this section, Rebecca Fiebrink discusses how machine learning can provide productive surprises and challenges in the creative work process. Using tools like Wekinator, she emphasizes the value of creative tools that inject unexpected ideas into the working process. At the same time, she warns against overlooking other types of machine learning, or even non-machine-learning methods of working with data. She suggests that building with data and machine learning can enable people to do things they couldn't do before, and explores how creative applications can serve as case studies for making people's other experiences with data and machine learning more empowering.

  • 00:50:00 In this section, the speaker addresses a question from the audience about the challenges of using machine learning with sound. The speaker acknowledges that sound presents some unique challenges in terms of cultural subjectivity, but overall, sound can be approached using typical machine learning processes with similar results to other media. The speaker emphasizes that data and how it is used to address problem domains is more important than the medium itself. The speaker also discusses how machine learning can be used as an interface to create things and the importance of discussing human alignment with machines and who should define the objectives.

  • 00:55:00 In this section, the speaker discusses the difficulty in defining an objective for machine learning and how much of it is an experimental process where the creator creates a dataset, tries something out, and then uses the data to steer the model towards a certain direction. The experiential aspect of the process allows the creator to learn about machine learning in a specific context through trial and error, and this aspect can be a powerful tool for people to learn about machine learning. Recent research by Carrie Cai and others also shows that similar experimental exploratory procedures can help people build trust and understand what is being modeled, even in applications where those people may not have prior machine learning expertise.
 

MIT 6.S192 - Lecture 9: "Neural Abstractions" by Tom White




In this video, artist and lecturer Tom White discusses his approach to incorporating machine perception and neural networks into his artistic practice. White shares his background in studying math and graphic design at MIT and his current work teaching creative coding at Victoria University. He also discusses his research on building tools to help others use the medium creatively and his own artwork that explores machine perception. White showcases his sketches and prints, created using AI algorithms, and talks about his collaborations with music groups and his recent art exhibitions. He also discusses the challenges of collaboration with neural networks and the unintended consequences of putting AI-generated art in the wild.

  • 00:00:00 In this section of the video, artist and lecturer Tom White introduces himself and talks about his background, which includes studying math and graphic design at MIT's Media Lab. He discusses his interest in exploring programming as a creative discipline and how he is now teaching creative coding at Victoria University in Wellington. White also mentions his research, which focuses on building practical tools to help others use the medium creatively. Additionally, he talks about his own separate arts practice, which he says he will be discussing more in his talk, and hopes to inspire students interested in pursuing similar paths.

  • 00:05:00 In this section, the speaker provides an outline for his talk on neural abstractions and his artwork that explores machine perception. He explains that machines have their own unique ways of seeing the world, and his artwork aims to expose this to a wider audience. The speaker also touches on the topic of AI representation and abstraction, and how he investigates the representations of neural net vision systems to convey them in an artistic context. He exemplifies this by showing a few of his artwork pieces based on datasets of actual images, such as eyes, faces, and chickens, and how he introduces diagnostics into the process to understand the system's inner world. The talk concludes with the implications of exploring machine perception in art and how it can help us appreciate the different ways machines perceive the world.

  • 00:10:00 In this section, Tom White discusses some of his initial projects during his time at MIT, including his exploration of machine learning techniques for creating real-time video filters, his creation of a custom hand interface for multi-touch interaction, and his art project Stream of Consciousness, which incorporated AI techniques such as WordNet to find related words. White also talks about his involvement in creating the core software library Acu, which later served as a precursor to systems such as Processing and OpenFrameworks, and how his current work involves creating sketches and drawings with machine learning processes.

  • 00:15:00 In this section, the speaker discusses precedents in art that have inspired his work, starting with the artist Stuart Davis, who took common objects and forced himself to paint them over and over until he found something new in them. Harold Cohen was another artist who experimented with generative drawing systems, formally codifying his ideas about mark making through artificial intelligence; working later in life more as a collaborator with these systems, Cohen's core question remained "what is an image?" The speaker then talks about the technical side of Andy Warhol's and Roy Lichtenstein's work, and screen printing as a technique they shared in executing their artwork.

  • 00:20:00 In this section, Tom White discusses his technique for creating prints: rather than simulating brushwork, he screen-prints forms generated by a computer vision system that perceptually optimizes images so that, to AI classifiers, they look like electric fans or binoculars. White notes how Stuart Davis learned to perceive and represent familiar objects in new ways by staring at the same objects every day; in a similar vein, White uses computer vision systems to introduce new ways of perceiving and representing familiar objects.

  • 00:25:00 In this section of the video, the speaker shows demos of a neural network system that creates simple sketches from very few strokes, which he can manipulate to create different images. He explains how he created sketches of a hammerhead shark and an iron using the same number of strokes, and then shows that by flipping the position of the strokes he can trick the neural networks into seeing the iron as a shark and vice versa. He demonstrates how the system can sketch different objects, and shows that it is not affected by left- or right-handed orientation but is influenced by the colors present in the training dataset.

  • 00:30:00 In this section, Tom White discusses examples of how machine learning systems absorb quirks of their training data. One example is a vision system trained on a sample of measuring cups that were predominantly green, leading it to believe that green measuring cups are more common than they actually are. White also discusses a print he made of a tick that registered more strongly with the classifier than all of the validation examples, which he compares to the art and design principle of amplification through simplification, where exaggeration yields a better abstraction of a concept. Finally, White presents his synthetic abstractions series, abstract prints that mimic explicit or not-safe-for-work images and trigger filters in search engines.

  • 00:35:00 In this section, the speaker shares examples of how his systems work with online APIs, including datasets of whales, penguins, and eyes. He also discusses his collaboration with a music group, for which he created custom datasets, as well as his recent art exhibitions featuring groups of images that the computer thinks are knots, ants, or other objects. The speaker goes on to talk about different approaches to generative techniques and how his artwork impacts the real world. He mentions his interest in generative networks and how he created an artwork using neural-net-generated faces.

  • 00:40:00 In this section, Tom White talks about his exploration of generative networks and his work with graduate students on a tool that exposes samples from a generative model through the familiar interface of a spreadsheet, turning it into a creativity tool. Other artists such as Helena Sarin, Mario Klingemann, and Robbie Barrat, along with the "Edmond de Belamy" portrait, are also mentioned. White discusses the challenges of collaborating with these systems for art making, emphasizing the roles of both the artist and the system in the co-creation process. Finally, he talks about the unintended consequences of putting AI-generated art in the wild and how we can understand it through visualization techniques and by asking the system what it sees.

  • 00:45:00 In this section, the speaker talks about techniques similar to DeepDream, where an image is fed into a system to visualize what the system sees in it. He mentions how his art pieces bump into real-world systems, such as Tumblr's adult content filter, the Amazon API, and Sloan Kettering's academic offices, and discusses examples of how these vision systems end up categorizing the art pieces under the same labels they use for the real-world objects. The speaker explains that the core idea of his artwork is understanding the world through the eyes of machines, creating art for and by machines that people can appreciate regardless of their background in machine learning.

  • 00:50:00 In this section, Tom White explains why he chose screen printing as the medium for his physical art pieces. He highlights that physical work lets people relate to it differently than interactive installations with screens and cameras, that screen printing enables more precise work, and that it has a precedent among pop artists in the art world. He further explains that physical work is more difficult to pull off, partly because the pieces must also hold up when photographed, but it is an interesting way to take adversarial attacks into the physical world. Additionally, he talks about how art can help in better understanding algorithmic bias and other aspects of AI and cybersecurity.

  • 00:55:00 In this section, Tom White discusses how bias in the CelebA dataset, where women are more likely than men to be labeled as smiling, can lead to bias in generative networks aimed at modifying facial expressions. He notes that his work is not focused on adversarial examples but rather on visualizing and understanding the stimuli that trigger neural networks. White also talks about experimenting with simple representations, such as minimal strokes, to make the generation of visual outputs easier, noting that people can recognize images even in very low-resolution formats, an ability documented in the psychology research he draws on.

  • 01:00:00 In this section, Tom White encourages viewers to check out the research in the neural abstractions space and directs them to the videos from the previous year's workshop for more information. He emphasizes the value of the research and welcomes any questions viewers may have.
 

MIT 6.S192 - Lecture 10: "Magenta: Empowering creative agency with machine learning" by Jesse Engel




Jesse Engel, lead research scientist at Google Brain, discusses Magenta, a research group looking at the role of AI and machine learning in creativity and music. The group primarily focuses on machine learning models that generate media and makes them accessible through open-source code and a framework called magenta.js, which allows for the creation of interactive creative models in JavaScript. Engel emphasizes the importance of viewing music as a social and evolutionary platform for cultural identity and connection rather than a commodity to be cheaply produced and consumed. The group explores how machine learning can empower individuals with new forms of creative agency through expressivity, interactivity, and adaptivity. The lecture covers various topics, including designing machine learning models for music, using dilated convolutions for audio prediction, differentiable digital signal processing, and creating machine learning systems that produce beautiful failures. Additionally, Engel talks about the challenges of collaborating with artists and the grand challenge of out-of-distribution generalization and compositionality in learning models.

  • 00:00:00 In this section, Jesse Engel, lead research scientist at Google Brain, discusses Magenta, a research group that looks at the role of AI and machine learning in creativity and music. The group primarily focuses on machine learning models that generate media and makes them accessible through open-source code and a framework called magenta.js, which allows for the creation of interactive creative models in JavaScript. Engel emphasizes the importance of viewing music as a social and evolutionary platform for cultural identity and connection rather than a commodity to be cheaply produced and consumed. The group explores how machine learning can empower individuals with new forms of creative agency through expressivity, interactivity, and adaptivity.

  • 00:05:00 In this section, Jesse Engel talks about designing machine learning models that are more hackable and require less data to train, specifically in the context of music. He discusses the trade-offs between different facets of designing such algorithms, like making them low latency with intuitive, causal controls while still being expressive and adaptive. He compares two extremes: OpenAI's Jukebox, which models the raw audio waveform very realistically at the expense of requiring enormous amounts of data, and doodle-style models that represent music as structured note data but with unrealistic sounds. He ends by discussing the group's approach, which uses structure within the model to strike a compromise between interpretability and expressivity.

  • 00:10:00 In this section, Jesse Engel discusses the previous state of the art in audio transcription models and how they were limited in predicting notes in a way that matched human perception. He explains that errors in individual frames matter less than errors in when notes actually start, and that a new neural network architecture was created to better match the loss function to what we care about: how the music sounds when we play it back. The new state-of-the-art model achieved accurate transcription even on audio recorded "in the wild," as demonstrated by a piano player playing into a cell phone.
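The intuition that onsets matter more than individual frames can be illustrated with a toy loss. The real system reportedly uses dedicated onset modeling rather than the simple re-weighting sketched here, and every tensor below is a random placeholder:

```python
# Toy sketch: weight frame-wise errors so mistakes at note onsets dominate.
# This is an illustration of the idea only, not the model's actual loss.
import torch
import torch.nn.functional as F

def onset_weighted_loss(pred, target, onset_mask, onset_weight=10.0):
    """pred/target: (time, pitches) piano rolls in [0, 1];
    onset_mask: 1.0 at frames where a note starts, 0.0 elsewhere."""
    weights = 1.0 + (onset_weight - 1.0) * onset_mask
    per_frame = F.binary_cross_entropy(pred, target, reduction="none")
    return (per_frame * weights).mean()

pred = torch.rand(100, 88)                       # random "predictions"
target = (torch.rand(100, 88) > 0.95).float()    # random "ground truth" roll
onsets = (torch.rand(100, 88) > 0.98).float()    # random onset locations
print(float(onset_weighted_loss(pred, target, onsets)))
```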

  • 00:15:00 In this section of the video, Jesse Engel from Google Brain explains the importance of datasets for neural networks, using the example of a large dataset of performances from the International Piano-e-Competition. He discusses the use of neural networks, such as recurrent neural networks (RNNs) and the transformer architecture, to model musical sequences, and the challenge of tokenizing musical notes. To address this challenge, the team created a vocabulary of individual music events and time shifts. By accurately representing micro-timing, velocity, and other variations in the data, the models are able to produce more natural-sounding music.
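A rough sketch of that kind of event vocabulary is shown below. This is not Magenta's actual code; the token names, the 10 ms quantization step, and the note format are illustrative only:

```python
# Illustrative tokenizer: notes become NOTE_ON / NOTE_OFF / VELOCITY events,
# with quantized TIME_SHIFT tokens carrying the micro-timing between them.
def tokenize(notes, time_step=0.01, max_shift_steps=100):
    """notes: list of (start_sec, end_sec, pitch, velocity), any order."""
    boundaries = []
    for start, end, pitch, velocity in notes:
        boundaries.append((start, "NOTE_ON", pitch, velocity))
        boundaries.append((end, "NOTE_OFF", pitch, 0))
    events, current_time = [], 0.0
    for time, kind, pitch, velocity in sorted(boundaries):
        # Advance the clock with quantized TIME_SHIFT events (10 ms steps here).
        shift = int(round((time - current_time) / time_step))
        while shift > 0:
            step = min(shift, max_shift_steps)
            events.append(f"TIME_SHIFT_{step}")
            shift -= step
        current_time = time
        if kind == "NOTE_ON":
            events.append(f"VELOCITY_{velocity}")  # dynamics get their own tokens
        events.append(f"{kind}_{pitch}")
    return events

print(tokenize([(0.0, 0.48, 60, 90), (0.25, 0.73, 64, 70)]))
```

Because timing and velocity survive as tokens, a sequence model trained on this vocabulary can reproduce expressive performance details rather than a rigid quantized score.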

  • 00:20:00 In this section of the lecture, Jesse Engel explains how the Magenta team started with an original motif and used an autoregressive LSTM model to predict the next token given the previous tokens. Because the LSTM's long-term coherence was limited, they switched to the transformer, which attends over all previous data to improve coherence. With their transcription model, they could convert raw audio into thousands of hours of symbolic music, allowing them to train models with much more long-term coherence. To give more intuitive control, the team also extracted the melody and used it as a conditioning signal for generation. They could then use such models as a neural synthesizer for different sounds, with parameters tuned to specific sound sets.

  • 00:25:00 In this section of the video, Jesse Engel explains the technical details of Magenta's use of dilated convolutions, which let a neural network predict audio outputs from high-level controls. Dilated convolutions allow the system to look at a large span of time without down-sampling, avoiding information loss while remaining expressive. However, the process is slow, and longer-term structure requires longer-term conditioning. Through note conditioning, the system is able to generate realistic performances with interpretable intermediate representations.
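For readers unfamiliar with the trick, here is a small, self-contained sketch in PyTorch with made-up channel counts; it is not Magenta's model, only a demonstration of how exponentially growing dilation covers a long time span without down-sampling:

```python
# WaveNet-style stack: kernel size 2, dilation doubling per layer, causal padding.
import torch
import torch.nn as nn

class DilatedStack(nn.Module):
    def __init__(self, channels=32, layers=6):
        super().__init__()
        self.convs = nn.ModuleList([
            nn.Conv1d(channels, channels, kernel_size=2,
                      dilation=2 ** i, padding=0)
            for i in range(layers)  # dilations 1, 2, 4, ..., 32
        ])

    def forward(self, x):
        for conv in self.convs:
            # Left-pad by the dilation so the convolution stays causal.
            x = conv(nn.functional.pad(x, (conv.dilation[0], 0)))
            x = torch.relu(x)
        return x

stack = DilatedStack()
audio_features = torch.randn(1, 32, 1000)   # (batch, channels, time)
print(stack(audio_features).shape)          # same length; receptive field ~ 2^layers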

  • 00:30:00 In this section, we learn about DDSP, or differentiable digital signal processing. Jesse Engel proposes integrating traditional signal processing elements such as oscillators, filters, and synthesizers with deep learning to create a more efficient, realistic, and responsive system. Instead of having a neural network generate audio directly, known signal processing elements are used, and a neural network controls them to produce expressive outputs. DDSP modules are interpretable and efficient, and sound can be modeled by banks of variable-frequency sinusoidal oscillators, with harmonic oscillation and even second-order differential equations providing extra flexibility for audio modeling. DDSP captures not just periodic components but also noise elements, which can be shaped with different filters. By controlling these synthesis elements with a neural network decoder, audio can be generated that compares favorably to the original.
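As a rough illustration of the idea, and not the DDSP library's API, a harmonic-plus-noise synthesizer can be written in a few lines; the frequencies, weights, and noise level below are arbitrary, and in the real system a neural decoder supplies these controls as functions of time:

```python
# Toy harmonic-plus-noise synthesizer in NumPy, illustrating the DDSP idea.
import numpy as np

def harmonic_plus_noise(f0, amplitude, harmonic_weights, noise_level,
                        sr=16000, duration=1.0):
    t = np.arange(int(sr * duration)) / sr
    phase = 2 * np.pi * f0 * t
    # Sum of harmonics at integer multiples of the fundamental frequency.
    audio = sum(w * np.sin((k + 1) * phase)
                for k, w in enumerate(harmonic_weights))
    # Broadband noise; in DDSP this would be shaped by a learned filter.
    audio = audio + noise_level * np.random.randn(len(t))
    return amplitude * audio / (np.max(np.abs(audio)) + 1e-8)

# Fixed, illustrative controls; the real decoder predicts these per frame
# from the input audio's pitch and loudness.
tone = harmonic_plus_noise(f0=220.0, amplitude=0.8,
                           harmonic_weights=[1.0, 0.5, 0.25, 0.1],
                           noise_level=0.02)
print(tone.shape)
```

Because every element here is a known, differentiable signal processor, gradients can flow from an audio loss back to the network that sets the controls, which is what makes the approach data-efficient.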

  • 00:35:00 In this section of the lecture, the speaker explains how they can train the decoder to create high-quality synthesis with less data by running spectrograms through the model and re-synthesizing them. This allows the model to turn pitch and loudness into a flute sound or a violin sound, and even to transfer the timbre of singing voices. They can also turn off individual model components, such as reverb and harmonics, to inspect individual attributes. The model can be compressed to under one megabyte for real-time operation in a browser. The DDSP approach applies across a broad range of musical cultures, preserving microtonal variations and pitch shifts.

  • 00:40:00 In this section, Jesse Engel discusses the Magenta project and its goal of empowering creative agency using machine learning. He explains that they have received positive responses from musicians who find the tool helpful in their creative process rather than replacing it. The Magenta team is focused on creating a broader ecosystem, including a web interface for training models, deploying to web apps, and real-time plug-ins for music software. Engel notes that the system is more interactive, real-time, and adaptive, but there is still room for improvement in terms of expressivity and diverse interactive models. The team is exploring unsupervised models to learn the structure and labels from data. They have several demos, software, and professional tools available on their website for anyone to try out.

  • 00:45:00 In this section, Jesse Engel explains that creating machine learning systems that produce beautiful failures is one way of thinking about creating systems that artists can use. For example, the limitations designed into the original drum machines turned out to be their defining characteristic, which caused hip-hop and electronic musicians to use the sounds in fun and artistic ways. Additionally, Engel discusses the relationship between interpretability and interactivity and suggests that the language and assumptions used by machine learning models could be the solution to creating APIs that act as intermediaries between the software and the user for maximal interpretability.

  • 00:50:00 In this section of the video, Jesse Engel discusses the challenges of enforcing structure for generalization while designing models that fit the target audience. He explains how neural networks can emulate Newtonian mechanics within a specific set of images but struggle to extrapolate when one aspect of the image changes. He also touches on how building models that adapt to the intensity of the music or the volume of the kick drum could be a fascinating direction. Collaboration with artists is also brought up, but Jesse explains that it is challenging due to practical limitations and their research-based promotion system. The discussion ties into the grand challenge of out-of-distribution generalization and compositionality in learning models.
 

MIT 6.S192 - Lecture 11: "Artificial Biodiversity", Sofia Crespo and Feileacan McCormick




In this lecture on "Artificial Biodiversity," Sofia Crespo and Feileacan McCormick explore the intersection of technology and nature to produce unique forms of art. The duo discusses their interest and use of machine learning and its connection to beauty and highlights the limitations of human perception. They also discuss their collaborative projects, including "Entangled Others," where they advocate for representing both individual species and their complex entanglements to create a better understanding of ecological systems. The speakers emphasize the importance of sustainability and collaboration in artistic practice and the relationship between tools and art, stating that algorithms cannot replace human artists.

  • 00:00:00 In this section, Sofia Crespo and Feileacan McCormick discuss the concept of artificial biodiversity and explore the question of what makes something beautiful in the realm of machine learning. The duo considers whether the beauty is found in the dataset used to train neural networks, in the process of training the model, or in the interaction between layers of artificial neurons. They also draw parallels between training a neural network and meditation, as both involve the curation of a dataset and the exploration of patterns. Overall, the discussion highlights the ways in which technology and nature can intersect to produce unique forms of art.

  • 00:05:00 In this section, Sofia Crespo discusses her fascination with jellyfish and the limitations of human perception of color. She explains that her interest in jellyfish led her to explore synthetic jellyfish generated by machine learning algorithms. She ponders what artificial neural networks can teach us about our own cognitive processes and about the concept of "nature-ness" and how to visualize it. Crespo also discusses Aaron Hertzmann's paper on visual indeterminacy in GAN art, which explores how meaningful visual stimuli can be visually indeterminate and still trigger cognitive responses.

  • 00:10:00 In this section, the speakers discuss their interest in machine learning and its connection to beauty. They explain that when working with machine learning they operate within a very human sphere, utilizing human-created datasets and therefore addressing human visual assumptions about nature. The speakers suggest that technology is a part of nature because humans are a part of nature, and that the idea of technology as a separate entity from nature is flawed. They also discuss the definition of artificial life, noting that it can be pursued in disciplines as varied as software, art, or even wetware, hardware, and genetics. They use Karl Sims' work on evolved virtual creatures to demonstrate how simple primitives can embody life-like qualities, with competitiveness and goal-oriented behavior emerging from their interactions.

  • 00:15:00 In this section, we learn how artificial neural networks can create fantastical creatures and language, much like the Codex Seraphinianus by Luigi Serafini. These creations are a remixed recombination of human knowledge of botany, zoology, language, and architecture, and despite their artificiality they show remarkable diversity. The lecture also discusses Anna Atkins, a 19th-century photographer and botanist who pioneered the use of the cyanotype technique. The speaker combined Atkins' technique with a neural network to generate life-like creatures, which were then printed as cyanotypes. This project, Artificial Natural History, is a book that echoes how humans depicted nature before the existence of cameras.

  • 00:20:00 In this section, Sofia Crespo and Feileacan McCormick discuss their collaborative project, "Entangled Others," where they advocate for representing not only individual species, but also their complex entanglements to create a better understanding of ecological systems. They explain their first project, "Artificial Remnants," where they generated 3D models of insects and created an augmented reality experience for people to interact with the digital creatures. The success of this project led to their latest effort, which involved building an ecosystem and exploring the abstract concept of existing in a relationship. However, due to COVID-19, their exhibition plans were altered.

  • 00:25:00 In this section, the speakers discuss their project on artificial biodiversity and how they turned to coral reefs as an example of the interconnectedness of ecosystems. Because of a lack of data, they worked with an artist to create synthetic corals that mimic the diversity of coral morphologies. They acknowledge that this is a subjective representation, not an accurate reflection of the complex system of a coral reef, but it still reminds us of its qualities. They also talk about the fascination of putting nature in the spotlight through an abstract representation of its patterns, and note that working with biomaterials was a learning challenge.

  • 00:30:00 In this section, the speakers discuss how they made an effort to prioritize sustainability by collaborating with a studio that specializes in creating bioplastic from discarded olive pits. This material can be melted and repurposed again and again, allowing them to create exhibits and then reuse the material for future projects. They emphasize that it is crucial for artists working with nature to think sustainably and to consider the physical consequences of digital layers, especially when using machine learning in artistic practice. They also stress the importance of collaboration and interdisciplinary interaction for strengthening existing connections and creating new ones, which led them to issue an open call for others to reach out for collaborations, conversations, and more. The discussion also touches on philosophy, referencing Plato as well as Deleuze and Guattari.

  • 00:35:00 In this section, artists Sofia Crespo and Feileacan McCormick discuss the relationship between tools and art. They explain that just like a pencil shapes the way we draw, digital tools also have shaping qualities. They also touch on the importance of not forgetting the artistic perspective when creating generative and digital art, and why it's necessary to question not only the technical solutions but also the why, how, and what. They state that it is essential to remind ourselves that art is made for humans to consume and that algorithms cannot replace human artists.
 

MIT 6.S192 - Lecture 12: "AI+Creativity, an Art Nerd's Perspective" by Jason Bailey




Jason Bailey discusses how machine learning is impacting the field of art, from forgery detection to price prediction. He urges artists to be aware of the biases inherent in data-driven art and stresses the need for training data that is inclusive of all perspectives.

  • 00:00:00 Jason Bailey is a lecturer at MIT who will be discussing AI and creativity. He comes from a background of engineering and marketing, and brings this experience to his talk on the intersection of art and technology. Bailey will focus on three key areas: art history, price prediction in the art market, and the use of AI and ML in the creative arts.

  • 00:05:00 Jason Bailey describes how he became interested in the problem of forgery in art, and how he spent three years scanning large-format books to create a database of artists' complete works. He talks about how rare and hard to find these catalogues raisonnés are, and how someone recently reissued a popular one for around $2,000.

  • 00:10:00 Jason Bailey's blog, artnome.com, explores ways to use data to better understand and critique art. In 2017 the blog received attention from FiveThirtyEight, which published a story on his project "AI for Art Scholarship: What Does That Look Like?" After sharing links to his projects and publications, Bailey gives a brief overview of the rest of his talk.

  • 00:15:00 Jason Bailey discusses how machine learning is useful in art history, particularly in analyzing paintings and understanding the history of art. He also talks about his recent project, which involved training a machine learning model to identify iconic paintings by the same artist across different museums.

  • 00:20:00 Jason Bailey's lecture explores the relationships between painting prices and single pixels that make up paintings, as well as trends in the art market. His machine learning platform was able to predict prices for paintings by Spanish painter Pablo Picasso with a correlation of 0.58.

  • 00:25:00 Jason Bailey discusses the current state of machine learning and its impact on the art world. He talks about how machine learning is being used to create more realistic and surreal art, and how this innovation has recently sparked a renewed interest in the field.

  • 00:30:00 Jason Bailey gives a lecture on artificial intelligence and creativity, describing how deep dreaming and style transfer can be used to create art. He talks about his own experiences with these technologies and how they are no longer as exciting to him as when he first discovered them. He finishes this part of the lecture by discussing the work of artist Robbie Barrat.

  • 00:35:00 Jason Bailey gives a lecture on AI and creativity, discussing how traditional art training is insufficient to deal with the present day, when AI and generative art are prevalent. He discusses how his background in art allows him to connect with artists and promoters of generative art, and how his own work has been influenced by these artists.

  • 00:40:00 Jason Bailey discusses how technology and art have intersected in the past, and how data analytics can help artists measure abstraction. He also mentions a project he was involved in where they calculated abstraction in a painter's career.

  • 00:45:00 Jason Bailey explains how his team's algorithm can be used to predict the prices of paintings, based on a number of factors such as the artist's historical popularity, the complexity of the painting, and the material used in the painting. He also notes that the algorithm is still in its early stages, and that more research is needed in order to improve it.

  • 00:50:00 In this lecture, Jason Bailey discusses how he uses auction data to study creativity, as well as how he has incorporated other fields, such as art and nature, into his models.

  • 00:55:00 Jason Bailey discusses the impact of AI on creativity, emphasizing the need for training data that is inclusive of all perspectives. He also discusses the potential consequences of biased AI algorithms. Finally, he urges artists to be aware of the biases inherent in data-driven art.
 

MIT 6.S192 - Lecture 13: "Surfaces, Objects, Procedures: Integrating Learning and Graphics for 3D Scene Understanding" by Jiajun Wu




Jiajun Wu, an assistant professor at Stanford, discusses his research on scene understanding in machines through the integration of deep learning and domain knowledge from computer graphics. Wu proposes a two-step approach to recover 3D object geometry from a single image: estimating the visible surface through a depth map and completing the shape based on prior knowledge from a large dataset of similar shapes. Wu also proposes using spherical maps as a surrogate representation for surfaces in 3D to better capture surface features, allowing the system to produce smoother, more detailed shape completions. Additionally, Wu discusses how reconstructing shapes into shape programs can significantly improve modeling and reconstruction, especially for abstract and man-made objects. Finally, Wu discusses how domain knowledge from computer graphics can be integrated with machine learning to improve shape reconstruction, texture synthesis, and scene understanding.

  • 00:00:00 In this section of the video, Jiajun Wu, an assistant professor at Stanford, discusses his research on scene understanding in machines through the integration of deep learning and domain knowledge from computer graphics. By replicating human cognition, his goal is to build machines that have a comprehensive understanding of scenes, including object categories, 3D geometry, physical properties, and future predictions. Wu's research also aims to bridge the gap between machine learning and art by creating a hybrid model that integrates domain knowledge from computer graphics with deep learning. This approach allows for new possibilities in image editing and generation, as well as creativity in the application of deep learning.

  • 00:05:00 In this section of the lecture, Jiajun Wu discusses the problem of recovering a 3D object geometry from a single image, which can be seen as the inverse of the classic problem in computer graphics of generating a 2D image from 3D shape, texture, lighting, material, and viewpoint. While a neural network can be trained to perform the task, Wu suggests that integrating prior knowledge from computer graphics could improve performance, efficiency, and generalizability. He proposes a two-step approach to solving the problem: first, estimating the visible surface through the depth map, and second, completing the shape based on prior knowledge from a large dataset of other similar shapes.

  • 00:10:00 In this section, Jiajun Wu discusses the importance of using depth as an intermediate representation to capture object surfaces and shape details. By training a model on the ShapeNet dataset and randomly sampling shapes from it, Wu demonstrates that this approach greatly improves the accuracy of the output. However, he acknowledges that generalizing to objects the model has never seen before can be a challenge, leading to misinterpretations of the data. To address this, Wu proposes a differentiable layer that back-projects the 2D depth representation into 3D, making shape completion a deterministic and fully differentiable process.
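The back-projection step itself is simple pinhole-camera geometry. Here is an illustrative sketch with made-up camera intrinsics, written without the differentiable framework used in the actual system:

```python
# Lift a depth map into a 3D point cloud by inverting the pinhole projection.
import numpy as np

def backproject(depth, fx=500.0, fy=500.0, cx=None, cy=None):
    """depth: (H, W) array of metric depths; fx, fy, cx, cy: camera intrinsics."""
    h, w = depth.shape
    cx = w / 2 if cx is None else cx
    cy = h / 2 if cy is None else cy
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    # Invert the projection: x = (u - cx) * z / fx, y = (v - cy) * z / fy.
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

points = backproject(np.ones((120, 160)) * 2.0)   # a flat surface 2 m away
print(points.shape)                               # (19200, 3)
```

In the system described above, this operation is implemented as a network layer so gradients can pass from the 3D completion loss back to the depth estimator.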

  • 00:15:00 In this section, the speaker discusses the limitations of using a partial surface for objects in 3D, specifically that many areas of 3D space are empty, which makes it difficult for the completion network to capture surface features. To address this, the speaker proposes using spherical maps as a surrogate representation for surfaces in 3D, where every pixel corresponds to a point on the surface, and no representation is wasted. The pipeline takes an estimated depth and projects it into a partial spherical map, which can then be completed using a completion network in a spherical map space. This new method results in much smoother and more detailed output, and is generalizable to object categories that were not seen during training.
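To make that representation concrete, here is a toy sketch of mapping surface points onto a spherical image indexed by viewing direction. The resolution and centering convention are invented, and real pipelines handle occlusion and interpolation far more carefully:

```python
# Project 3D surface points onto a spherical map, storing distance from the
# object's center so that (almost) every pixel corresponds to a surface point.
import numpy as np

def to_spherical_map(points, height=64, width=128):
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    theta = np.arccos(np.clip(z / (r + 1e-8), -1, 1))   # polar angle in [0, pi]
    phi = np.arctan2(y, x) + np.pi                       # azimuth in [0, 2*pi)
    rows = np.clip((theta / np.pi * height).astype(int), 0, height - 1)
    cols = np.clip((phi / (2 * np.pi) * width).astype(int), 0, width - 1)
    sphere = np.zeros((height, width))
    sphere[rows, cols] = r        # keep one radius sample per pixel
    return sphere

sphere = to_spherical_map(np.random.randn(5000, 3))
print(sphere.shape)
```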

  • 00:20:00 In this section, Jiajun Wu discusses how intermediate representations and back projection may help to build a better shape reconstruction system that is more generalizable. Using examples of tests on humans and horses, Wu notes that the system is able to reconstruct objects in a relatively reasonable way from a single view, without previously seeing deformable objects, indicating that the system may be used to build better vision systems. Wu also explains how the intermediate representations of surfaces and forward projection may help make rendering better, allowing for the synthesis of new object shapes and textures with more control over the independent factors.

  • 00:25:00 In this section, Jiajun Wu discusses the process of combining previous techniques to extend them to scenes. First, he uses inversion systems to obtain representations of objects' geometry, pose, and textures, including latent representations for non-object-like background segments such as trees or sky. Then, he edits these representations to see how different changes in the scene, such as moving a car closer or changing its texture, affect the overall image. Wu emphasizes the importance of understanding that objects have 3D geometry, as this allows the method to produce complete and accurate results. Finally, he discusses the challenge of shape abstraction when reconstructing man-made objects such as tables, and how incorporating abstract and program-like representations can lead to better results.

  • 00:30:00 In this section, Wu discusses how reconstructing shapes into shape programs can significantly improve modeling and reconstruction, especially for objects such as furniture. Additionally, he explains how procedural structures like replication and symmetry can be leveraged for content creation, such as through an algorithm that can guide 3D projection for building design. To connect raw 2D images and 3D space, Wu's team was inspired by a stochastic search to detect primitives such as lines and triangles in visual data, and are now attempting to synthesize shapes of 3D primitives to guide image sensing.

  • 00:35:00 In this section, Jiajun Wu discusses how internal learning can be used to learn everything from a single image from image statistics, observing that within a single image patches can repeat themselves, and this repetition can occur across scales. By using neuronal activations to identify repeating objects in a single image, the primitives found can be lines, rectangles, spheres, or cylinders, and neural networks can learn features to identify and synthesize programs on top of the centroids of these repeated objects. This can help solve a number of problems, such as image completion or extrapolation, and regularity editing to make scenes more irregular.

  • 00:40:00 In this section, the speaker discusses how to apply their program to 3D images, which is more complex than a single plane. The problem here is to partition the image into multiple planes while considering the orientation and surface levels of each plane. The speaker suggests using visual cues, such as vanishing points and wireframes, to address this. However, wireframe features can be noisy, and there might be multiple possible candidate plane partitions. By using their program's top-down knowledge, they can rectify the candidate planes to 2D images and perform program synthesis to find the correct partition of the image. Doing so can help them find the best joint operation results and image synthesis, which traditional methods cannot accomplish.

  • 00:45:00 In this section, Jiajun Wu discusses how domain knowledge from computer graphics can be integrated with machine learning to improve shape reconstruction, texture synthesis, and scene understanding. Wu emphasizes that scene understanding rests on the minimal but universal causal structure behind visual data: objects, surfaces, projections, and occlusions. By integrating learning and graphics, Wu believes there is great potential for creating richer 3D models that go beyond traditional 2D images. While Wu and his team have not delved into 3D printing, they are interested in 3D shape modeling and the possibility of using the inferred procedures behind those models.
 

MIT 6.S192 - Lecture 14: "Towards Creating Endlessly Creative Open-Ended Innovation Engines" by Jeff Clune




Jeff Clune, a researcher at OpenAI, discusses his work on creating endlessly creative open-ended innovation engines in this MIT lecture. He seeks to create algorithms that follow the recipe shared by natural evolution and human culture: start with a set of things, generate new things, evaluate and keep what is interesting, and keep modifying the interesting novelties. Clune explores using neural networks to recognize new things, discusses the MAP-Elites algorithm, and introduces Compositional Pattern-Producing Networks as an encoding. He shows how these tools can be combined to generate complex and diverse images, solve hard problems, and create open-ended algorithms that constantly innovate solutions to new challenges.

  • 00:00:00 In this section, Jeff Clune, an associate professor in computer science at the University of British Columbia and research team leader at OpenAI, discusses his research on creating endlessly creative open-ended innovation engines. He reflects on his personal journey, starting in philosophy and then shifting towards building computational systems to tackle the grand challenge of AI. Clune is interested in creating open-ended algorithms, which endlessly innovate and are seen in nature's endless creations, such as the complex engineering designs of jaguars and hawks.

  • 00:05:00 In this section, the speaker discusses the concept of innovation engines, which he defines as the recipe followed by both natural evolution and human culture that allows them to be creative. This recipe involves starting with a set of things, generating something new, evaluating whether it is interesting, and retaining and modifying interesting results. The speaker aims to create an algorithm that can eventually perform this process automatically, without human intervention. The biggest challenge is to avoid generating uninteresting novelty and to generate only interesting novelty; the speaker suggests using neural networks that recognize a large number of classes as a way of detecting new types of things and judging whether results are interesting.

  • 00:10:00 In this section, Jeff Clune discusses an algorithm called MAP-Elites and its place in the field of algorithmic search. He explains that many hard problems require exploring and discovering new things rather than just optimizing toward a single goal, and that algorithms should reflect this. Clune and his colleagues have been working on a subfield called quality diversity algorithms, which aim to find a large, diverse set of solutions, each as good as possible for its type of solution. Such algorithms switch between goals whenever progress on one task turns out to help another, which Clune believes may be the only way to solve really hard problems.

  • 00:15:00 In this section, Jeff Clune, a researcher working at the intersection of biology and artificial intelligence, describes the MAP-Elites algorithm, which keeps the best solution found for each combination of user-chosen dimensions of interest. Clune explains that he and his colleagues first generated soft robot morphologies with a conventional genetic algorithm; within any single run the creatures ended up almost identical, and the only way to get a diversity of designs was to start a new search. To remedy this, Clune applied MAP-Elites to the same problem, choosing the number of voxels and the amount of a particular material as the dimensions of interest instead of relying on the canonical optimization algorithm. The algorithm explored a much wider space of possibilities and ultimately produced much better results. Clune also notes that the encoding they use, the Compositional Pattern-Producing Network (CPPN), is critical to the problem and is described later in the talk.
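The core loop of MAP-Elites is compact enough to sketch. The version below is illustrative only: solutions are 2-D parameter vectors in [0, 1], and the fitness and behavior functions are toy stand-ins, but it shows the per-cell archive of elites and the implicit goal switching described above:

```python
# Toy MAP-Elites: a grid over two behavior dimensions, each cell keeping the
# best solution found so far whose behavior falls in that cell.
import random

def map_elites(fitness, behavior, bins=10, iterations=5000):
    archive = {}                                   # (row, col) -> (fitness, solution)

    def add(solution):
        b1, b2 = behavior(solution)                # behavior descriptors in [0, 1]
        cell = (min(int(b1 * bins), bins - 1), min(int(b2 * bins), bins - 1))
        f = fitness(solution)
        if cell not in archive or f > archive[cell][0]:
            archive[cell] = (f, solution)          # new elite for that niche

    for _ in range(20):                            # random initial solutions
        add([random.random(), random.random()])
    for _ in range(iterations):
        _, parent = random.choice(list(archive.values()))
        child = [min(max(p + random.gauss(0, 0.05), 0), 1) for p in parent]
        add(child)   # the child may land in any cell, so goal switching is free
    return archive

# Toy problem: behavior = the parameters themselves, fitness = closeness to (0.5, 0.5).
archive = map_elites(lambda s: -((s[0] - 0.5) ** 2 + (s[1] - 0.5) ** 2),
                     lambda s: (s[0], s[1]))
print(len(archive), "cells filled out of 100")
```

The innovation-engine idea discussed earlier follows the same retain-the-best-per-niche pattern, with a trained classifier's categories playing the role of the behavior cells.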

  • 00:20:00 In this section of the lecture, Jeff Clune discusses the encoding choice in deep learning and evolutionary algorithms. In direct encoding, every single feature in the final artifact is represented by a number on the parameter vector, while in generative encoding, information in the parameter vector is reused to produce the final product, resulting in more regular or patterned products. Nature uses generative encoding by using geometric patterns to determine the cell fate, which is the type of cell that each cell becomes, based on the cell's location in the body. This approach is seen as a lingua franca in developmental biology, where pre-existing patterns are combined to create new patterns in the final product.

  • 00:25:00 In this section, Jeff Clune, a researcher from OpenAI, discusses how to harness the power of developmental biology to make open-ended AI systems. He suggests using Compositional Pattern-Producing Networks (CPPNs), which abstract much of the power of natural developmental systems without any of the underlying chemistry, to encode phenotypic elements as a function of their geometric location. By querying a CPPN with the coordinates of each element of an artifact, such as a neural network or a robot morphology, these encodings can produce arbitrary complexity by mixing and matching symmetric, asymmetric, and repeating themes. Clune and his team took this idea into three dimensions, building a website called endlessforms.com where users can pick up each other's evolved shapes, producing a growing archive of stepping stones.
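A minimal CPPN-flavored sketch, with random weights and a fixed pair of activation functions rather than the evolved networks used in the actual experiments, shows how querying a small function network with pixel coordinates yields regular, symmetric patterns:

```python
# Query a tiny fixed network with (x, y, radial distance) per pixel; periodic
# and symmetric activations make repetition and symmetry fall out for free.
import numpy as np

def cppn_image(size=64, seed=0):
    rng = np.random.default_rng(seed)
    w1 = rng.normal(size=(3, 8))         # inputs: x, y, distance from center
    w2 = rng.normal(size=(8, 1))
    xs, ys = np.meshgrid(np.linspace(-1, 1, size), np.linspace(-1, 1, size))
    inputs = np.stack([xs, ys, np.sqrt(xs**2 + ys**2)], axis=-1)
    hidden = np.sin(inputs @ w1)          # periodic activation -> repetition
    out = np.tanh(hidden @ w2)[..., 0]    # smooth, symmetric activation
    return (out + 1) / 2                  # pixel intensities in [0, 1]

img = cppn_image()
print(img.shape, float(img.min()), float(img.max()))
```

Because the image is defined as a function of coordinates rather than pixel by pixel, the same small genome can be rendered at any resolution, or in 3D as on endlessforms.com.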

  • 00:30:00 In this section of the lecture, Jeff Clune discusses using CPPNs to automate design and 3D print arbitrary complex images, demonstrating the power of these tools to eliminate technical barriers and generate creative designs easily. He then applies CPPNs to the task of creating an open-ended algorithm and optimizes them to classify each of the thousand bins in ImageNet. Clune describes how the hypothesis of better performance was tested, resulting in images that frequently looked like the category they were associated with or evoked an artistic interpretation of the concept. Despite generating some "fooling images," this generation process allowed the team to explore an entirely new aesthetic space while demonstrating the flaws inherent in deep neural nets that have led to adversarial images.

  • 00:35:00 In this section, Jeff Clune discusses the quality diversity algorithm he and his team developed, which is capable of generating high-quality, diverse images. The algorithm produces a diverse set of images, some of which are aesthetically interesting and can be used for practical purposes such as business logos. He also explains how the algorithm's goal-switching capability allows adaptive radiations to occur, similar to what happens in biology and in technological fields. He provides insight into the evolutionary processes taking place within the algorithm, showing graphs and phylogenetic trees that trace the birth and evolution of innovative ideas. Additionally, he shares that the algorithm's outputs passed an artistic Turing test, being mistaken for art created by humans rather than AI.

  • 00:40:00 In this section, Jeff Clune recaps the idea of quality diversity (QD) algorithms, which can produce diverse solutions that perform well and have the ability to switch goals. He discusses their use in challenges such as robots that adapt to damage and hard-exploration problems like Montezuma's Revenge and Pitfall. He notes that while QD algorithms have the potential to innovate, they are not yet open-ended because they are constrained by a fixed environment. Clune then proposes creating open-ended algorithms such as the Paired Open-Ended Trailblazer (POET), which endlessly generates interesting, complex, and diverse learning environments together with their solutions. POET generates new learning environments that are neither too easy nor too hard for the current population of agents, optimizes agents to better solve each challenge, and allows goal switching between environments.

  • 00:45:00 In this section, Jeff Clune discusses the concept of "goal switching" - the ability for a system to compete in one environment, make progress, and then move on to another environment. He shows an RL setup in which terrains automatically become harder and harder. Clune explains that this is a way to measure progress and overcome local optima. He presents the POET algorithm and shows that it solves hard problems that cannot be solved by optimizing for them directly. He demonstrates that POET's goal switching is essential for overcoming local optima, as seen in a task where a newly optimized robot invades an old environment, replacing the previous incumbent. Clune notes that this type of complex innovation could pave the way for more advanced simulations.
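A highly simplified sketch of a POET-style outer loop (the scores, thresholds, and optimizer below are assumptions, not the paper's exact procedure): new environments are admitted only if they are neither too easy nor too hard for current agents, each paired agent is optimized in its own environment, and agents are transferred to other environments when they perform better there.

```python
import random

random.seed(0)

def score(agent, env):
    return -abs(agent - env)                       # toy performance measure

def optimize(agent, env):
    # Placeholder local optimizer: nudge the agent toward its environment.
    return agent + 0.2 * (env - agent) + random.gauss(0.0, 0.01)

pairs = [(0.0, 0.0)]                               # list of (environment, agent)
for generation in range(100):
    # 1) Propose a new environment by perturbing an existing one.
    env, agent = random.choice(pairs)
    new_env = env + random.gauss(0.0, 0.3)
    if -0.8 < score(agent, new_env) < -0.1:        # minimal criterion: not too easy, not too hard
        pairs.append((new_env, agent))
    # 2) Optimize every agent within its own paired environment.
    pairs = [(e, optimize(a, e)) for e, a in pairs]
    # 3) Goal switching / transfer: adopt another pair's agent if it does better here.
    for i, (e, a) in enumerate(pairs):
        best_agent = max(pairs, key=lambda p: score(p[1], e))[1]
        if score(best_agent, e) > score(a, e):
            pairs[i] = (e, best_agent)

print(len(pairs), "environment-agent pairs created")
```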

  • 00:50:00 In this section of the lecture, Jeff Clune discusses the potential of combining body optimization with environment generation to create creatures that are optimized for particular environments, in the same way that cave-dwelling spiders are. He also suggests pairing innovation engines like DALL·E with algorithms that invent both the challenge and the solution, and then detect what is interestingly new in the images, videos, music, or poetry produced. Clune mentions that his research team has also explored AI neuroscience, a field that studies how much deep neural nets understand about the images they classify. They did this by synthesizing images that maximally activate particular neurons, which let them probe, for example, the network's notion of a five-legged starfish.
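A bare-bones sketch of activation maximization as described above: starting from noise, follow the gradient of a neuron's activation with respect to the input pixels. The "neuron" here is a toy linear unit so the gradient is analytic; in the actual work the neuron sits inside a trained deep network and the gradient comes from backpropagation.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=(32, 32))            # toy "neuron": activation = sum(w * image)

image = rng.normal(scale=0.1, size=(32, 32))
for step in range(200):
    grad = w                             # d(activation)/d(image) for this linear unit
    image = np.clip(image + 0.01 * grad, 0.0, 1.0)   # gradient ascent on the input

print("final activation:", float(np.sum(w * image)))
```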

  • 00:55:00 In this section of the lecture, Jeff Clune discusses how deep learning image generation evolved from adding hand-crafted constraints toward natural-looking images to using deep learning itself to learn natural image priors. With slight tweaks to the algorithms, each generator produces wildly different artistic styles. Neural networks do capture what objects look like in a particular space, such as the space of natural images, and can produce images of higher photorealistic quality; however, little diversity is generated in these natural image spaces. To overcome this problem, plug-and-play generative networks were introduced, which produce a much wider range of diverse images than previously seen in deep learning.
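One way to read the "learned prior" idea is to optimize a low-dimensional latent code and let a generator map it to pixels, keeping results near the manifold of natural-looking images. The sketch below is an illustration of that general idea only, not the plug-and-play architecture; both the linear "generator" and the "neuron" are toy stand-ins.

```python
import numpy as np

rng = np.random.default_rng(1)
G = rng.normal(size=(16, 1024))          # toy linear "generator": 16-D latent -> 1024 pixels
w = rng.normal(size=1024)                # toy "neuron" inside a classifier

z = rng.normal(scale=0.1, size=16)       # latent code being optimized instead of pixels
for step in range(200):
    grad_z = G @ w                       # chain rule through the linear generator
    z = z + 0.001 * grad_z               # ascend the neuron's activation in latent space

image = G.T @ z                          # final image stays on the generator's manifold
print("final activation:", float(w @ image))
```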

  • 01:00:00 In this section of the lecture, Jeff Clune discusses the progress made in AI neuroscience and the creation of open-ended creative processes. He highlights how AI can recognize and learn about concepts in our world, such as volcanoes or a lawnmower, but is susceptible to producing and recognizing adversarial images. Clune recommends the work of Chris Olah and talks about his team's work exploring other modalities, such as speech and video. He also shares his excitement over the progress made and the future potential of the field, including generating synthetic images that activate neurons within a real monkey brain. Clune suggests that science often produces aesthetic artifacts and that modern machine learning tools allow for the merging of art and science. Finally, he recommends reading the works of Ken Stanley and Joel Lehman for students interested in joining the mission of creating endlessly creative, open-ended processes.

  • 01:05:00 In this section, Jeff Clune explains that open-ended algorithms have the potential to support advancements in artificial general intelligence. He recommends reading his AI-Generating Algorithms paper, which explores how these algorithms might be a path to general AI. Jeff also encourages researchers to apply these ideas in various domains and to use tools like GPT-3 or DALL·E to do so. He suggests that exploring low-hanging fruit in different areas, such as poetry or architecture, might lead to exciting advancements. Jeff also addresses Joseph's question about using the POET algorithm in a multi-agent setting and discusses the challenges that arise, such as the difficulty of measuring agent performance in such an environment.
 

MIT 6.S192 - Lecture 15: "Creative-Networks" by Joel Simon



MIT 6.S192 - Lecture 15: "Creative-Networks" by Joel Simon

In this lecture, Joel Simon explores his inspirations and approaches toward creative networks that draw from natural ecosystems. He demonstrates the potential of computation in the creative process, describing how techniques such as topology optimization, morphogens, and evolutionary algorithms can enable the emergence of incredible forms and textures. Simon also shares details about his GANBreeder project, an online tool for discovering and mutating images with a GAN, inspired by the CPPN-based Picbreeder, and discusses the potential of cross-recommendation systems in the creative process. Simon is optimistic about the future of technology and creativity, believing that humans can collaborate to optimize the functions of buildings and create something greater.

  • 00:00:00 In this section, Joel Simon explains his background and the inspirations behind his creative-network work. He highlights Brian Eno's critique of the notion of lone geniuses and describes how creativity can be understood as an emergent product of many forces working together. Simon also talks about his journey into sculpting, which led him to learn and explore computational ways of creating, emphasizing the difference between being digital and being computational.

  • 00:05:00 In this section, Joel Simon describes his inspiration for his work in computational design and topology optimization, which he discovered during his college years. Fascinated by the capacity for topology optimization to produce new forms that could never have been created in a traditional sense, Simon sought to explore its potential further. However, he realized he needed to move beyond simple optimization techniques and incorporate elements of real nature, such as adaptivity and environment, that could enable a building to grow like a tree, leading him to conduct experiments on generative architecture. His work was not just grounded in architectural design but also used methods of graph simulation and evolved virtual creatures as inspiration for increased complexity and innovation in computational design.

  • 00:10:00 In this section, the speaker discusses the use of pattern formation and morphogens in growth processes, specifically reaction-diffusion. He explains that these patterns can be used in art to produce texture, and he revisits the CPPNs from Jeff Clune's lecture, which use a simple network mapping position to color to produce an image. To take these growth ideas further, the speaker created the project "Evolving Alien Corals," which uses morphogens across the vertices of a 3D mesh to control how the vertices move and what they emit. This allowed compounding effects that gave rise to incredible forms. The colors of the corals are the morphogens being optimized, not just decorative patterns. The project also illustrates the idea of sculpting with forces or objectives to drive form, where form follows fitness function. The speaker briefly touches on ecosystems and the intermediate disturbance hypothesis, in which diversity peaks at an intermediate amount of disturbance.
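A compact Gray-Scott reaction-diffusion sketch, as one standard illustration of morphogen patterning; the grid size and parameter values are common textbook choices, not values taken from the lecture.

```python
import numpy as np

n = 100
U = np.ones((n, n))
V = np.zeros((n, n))
V[45:55, 45:55] = 0.5                      # seed a small patch of the second morphogen
Du, Dv, feed, kill = 0.16, 0.08, 0.035, 0.060

def laplacian(Z):
    return (np.roll(Z, 1, 0) + np.roll(Z, -1, 0) +
            np.roll(Z, 1, 1) + np.roll(Z, -1, 1) - 4 * Z)

for step in range(3000):
    reaction = U * V * V
    U += Du * laplacian(U) - reaction + feed * (1 - U)
    V += Dv * laplacian(V) + reaction - (feed + kill) * V

print("pattern range:", float(V.min()), float(V.max()))
```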

  • 00:15:00 In this section, Joel Simon discusses his fascination with creative networks that draw from natural ecosystems and explores how these landscapes are conducive to sculpting and manipulating patterns. He poses questions such as what it would look like to watch ecological collapse, or how disturbances such as invasive species or merging different islands would affect an ecosystem. Simon was also inspired by cuneiform and the idea of calligraphy as a solution to a multi-objective problem. To experiment with this, he created a custom neural architecture that generated glyphs for communication through a noisy medium, with each form required to be recognizable and mutually distinct, which resulted in the emergence of different languages. Later, he modified the system to be both cooperative and adversarial, producing unique calligraphy sets that resemble one another while remaining functionally distinct.

  • 00:20:00 In this section, Joel Simon discusses some of his generative art projects inspired by sources such as Matisse's self-portraits and Conway's Game of Life. He created portraits using genetic algorithms and explored the concept of a generative architecture for artificial life. Simon also talks about how he was inspired by the Picbreeder project, which used a neural network (a CPPN) to generate images that users then selectively breed to create new and interesting designs.

  • 00:25:00 In this section, the speaker discusses his inspiration for creating GANBreeder, an online tool for discovering and mutating images built on a GAN and inspired by the CPPN-based Picbreeder. He was inspired by the idea that greatness cannot be planned and intrigued by the innate human sense of interestingness that could help augment the algorithms behind the tool. He delves deeper into GANs and recognizes that GAN latent vectors have the properties needed for crossover, allowing child images to resemble both parents. The speaker describes different types of creativity and frames his tool as combinatorial creativity, combining BigGAN with Picbreeder to create GANBreeder. He also discusses the three ways GANBreeder lets users create images: getting random children, mixing two images together, and editing the genes of an image.
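A hedged sketch of these latent-vector operations, with a placeholder generator standing in for BigGAN: "children" are made by interpolating two parents' latent vectors (crossover) or by adding noise to one parent (mutation), and the generator maps each latent vector to an image.

```python
import numpy as np

rng = np.random.default_rng(0)

def generate(latent):
    # Placeholder for a real GAN generator such as BigGAN (assumption).
    return np.tanh(np.outer(latent, latent))                   # pretend "image"

def crossover(parent_a, parent_b, weight=0.5):
    return weight * parent_a + (1.0 - weight) * parent_b       # child resembles both parents

def mutate(parent, strength=0.2):
    return parent + strength * rng.normal(size=parent.shape)   # "random children"

a = rng.normal(size=128)                   # 128-D latent "genes"
b = rng.normal(size=128)
child = mutate(crossover(a, b, weight=0.6))   # mix two images, then perturb
print(generate(child).shape)
```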

  • 00:30:00 In this section of the lecture, Joel Simon discusses the creative process in terms of exploratory phases that range from open-ended to intentional, with a gradient in between. Biological parallels are mentioned, such as asexual reproduction, sexual reproduction, and CRISPR, as different ways to create and modify images. Simon then shows an example image he made, along with the genes that make it up, emphasizing the importance of interactive, collaborative exploration, since humans cannot think in 128 dimensions. Simon concludes with the idea that ArtBreeder can be used as a tool for finding ideas and inspiration, and mentions a recent feature that allows users to create their own genes, which is relevant to those interested in machine learning.

  • 00:35:00 In this section, Simon describes how his project, GANBreeder, takes advantage of a crowdsourced ecosystem of image tagging. By collecting samples of a subtle property in images, users can turn it into a tool or filter for creating more powerful genes. The project started as a simple grid of images with a prompt asking which image is most interesting. However, users have been using GANBreeder in unexpected ways, such as uploading photos to colorize historical figures, designing dresses, or even painting over characters. Simon emphasizes that the experiment was really the interface, not the GAN: the two had to go together in order to work.

  • 00:40:00 In this section of the video, Joel Simon discusses the potential power of a cross-recommendation tool that uses latent dimensions of variation not exploited by existing recommendation engines. He gives the example of being unable to filter music by whether lyrics are present while he is working, suggesting that if recommendation engines exposed such dimensions of variation, they could make much stronger recommendations. Simon also explores the ideas of ownership and collaboration in creative tools, describing an interactive art show he curated where no one "owned" the art because it had been created collaboratively by many people.

  • 00:45:00 In this section, Joel Simon discusses the limitations of human thinking in contrast with the potential of computational abilities in the creative process. Humans have certain biases in their thinking, including thinking in clear hierarchies, relying on routines, and not thinking in complex overlaps. Simon discusses how facilitating collaboration and exploration, and enabling new mediums and metaphors, can lead to new creative processes. Dialogue between a creative director and an artist is essential in this process, with the director guiding the artist's creativity. Simon is optimistic about the future of computation and creativity and believes it will be person-driven, with people using these tools to make new artwork they share with others, rather than a replacement for artists and creatives.

  • 00:50:00 In this section, Joel Simon discusses creativity and the misconception that technological advancements will replace artists. He believes that such advancements only make creative expression more accessible for everyone, and states that creativity is an innate human need and an end in itself. Simon ends by proposing a morphogenic design concept that adapts the natural process of breeding and uses biomimicry to create collaborative design processes that go beyond individual human cognitive abilities. He emphasizes that humans are part of a larger creative connective tissue, and that inspiration for projects is gathered from this larger system.

  • 00:55:00 In this section, Joel Simon shares his optimistic view of a future in which technology helps build buildings that function together harmoniously as a complex ecosystem. He believes that with new metaphors and techniques, people can collaborate to optimize the functions of these buildings in ways that go beyond individual comprehension. While technology has its pros and cons, Simon's positive outlook on the dialogue between machines and humans points toward a future where technology brings people together to create something greater.
 

MIT 6.S192 - Lecture 16: "Human Visual Perception of Art as Computation" by Aaron Hertzmann



MIT 6.S192 - Lecture 16: "Human Visual Perception of Art as Computation" by Aaron Hertzmann

The lecture explores perceptual ambiguity and indeterminacy in art and the use of generative adversarial networks (GANs) in creating ambiguous images. It discusses the impact of viewing duration on perception and the relationship between image entropy and human preferences. The lecturer suggests an evolutionary theory of art, where art is created by agents capable of social relationships. The use of AI in art is also discussed, with the conclusion that while algorithms can be useful tools, they cannot replace human artists. The lecture concludes with a few remarks on concepts such as value.

  • 00:00:00 In this section, the speaker discusses perceptual ambiguity and indeterminacy, which are important themes in modern art. He explains that images with multiple interpretations can flip back and forth between different percepts, that the perceived interpretation can change with viewing duration, and that this affects the choices individuals make. Visual indeterminacy describes images that seem to offer a simple, coherent interpretation yet fail to resolve into one, a theme that became popular in the modern era, especially with cubism. The psychology literature has discussed and studied perceptual ambiguity and ways to characterize this space of ambiguity, but finding comparable stimuli and measuring ambiguity was difficult until the emergence of generative adversarial networks in recent years.

  • 00:05:00 In this section, the speaker discusses the use of GANs in creating art and the natural visual ambiguity these types of images can exhibit. The team used these images in a study where participants were shown an image for a short period of time and asked to describe it. The results demonstrated that images with higher levels of perceptual uncertainty and ambiguity resulted in a greater range of descriptions from participants. Additionally, the duration of the viewing period had an impact on the number and variety of words used to describe an image, with participants converging to more coherent interpretations with longer exposure.

  • 00:10:00 In this section, the lecturer discusses the relationship between image entropy and human preferences for ambiguous images. The team found two categories of users, one preferring low-entropy images and another preferring high-entropy ones. However, clustering users into these categories has only been successful in predicting preferences for certain types of images and requires more natural language processing to extract the right information. Moving on, the definition of art and whether computers can create art are explored. Current definitions of art are found to be inadequate because they do not generalize to new art forms, such as those that might be created by aliens. Instead, the speaker suggests an evolutionary theory of art, whereby art is a social activity carried out by agents capable of social relationships. Under this view, a computer could only be an artist if it were such a social agent, and the speaker argues that framing current systems as artists is misguided because it gives non-experts the wrong understanding.
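The study's exact entropy measure is not specified here; as an assumption about the general approach, a simple sketch computes Shannon entropy from an image's intensity histogram, so a flat image scores low and a noisy one scores high.

```python
import numpy as np

def image_entropy(image, bins=64):
    hist, _ = np.histogram(image, bins=bins, range=(0.0, 1.0))
    p = hist / hist.sum()
    p = p[p > 0]                                 # drop empty bins
    return float(-np.sum(p * np.log2(p)))        # Shannon entropy in bits

flat = np.full((64, 64), 0.5)                    # uniform gray: minimal entropy
noisy = np.random.default_rng(0).random((64, 64))
print(image_entropy(flat), image_entropy(noisy))
```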

  • 00:15:00 In this section, the speaker discusses the use of ideas from computation to understand human perception of art and how art is made. He argues that computers cannot be artists until they possess personhood or a social relationship. However, computers are powerful tools for artistic creativity and provide new tools for artistic creation. The speaker also refutes the idea that AI art will lose its value as it becomes more accessible, pointing out that the best AI artists are experimenting with coding and carefully selecting results.

  • 00:20:00 In this section, Hertzmann discusses the use of artificial intelligence (AI) in art and questions whether machines that can generate art based on human preferences can be considered artists. He argues that current AI algorithms are simply following instructions and do not possess the creativity of a human artist. However, he is excited about the potential for algorithms to model the artistic process and preferences, allowing them to be useful tools in creating and curating art. Ultimately, Hertzmann does not believe that algorithms can replace human artists, as art is a product of culture and time.

  • 00:25:00 In this section, the speaker offers a few brief concluding remarks following the discussion of concepts such as value; no substantially new topics are introduced. The speaker is thanked for an enlightening and inspiring talk.
 

MIT 6.S192 - Lecture 17: "Using A.I. in the service of graphic design" by Zoya Bylinskii



MIT 6.S192 - Lecture 17: "Using A.I. in the service of graphic design" by Zoya Bylinskii

Zoya Bylinskii, a research scientist at Adobe, explores the intersection of graphic design and artificial intelligence (AI) in this lecture. Bylinskii emphasizes that AI is meant to assist rather than replace designers by automating tedious tasks and generating design variations. Bylinskii gives examples of AI-assisted tools, including interactive design tools and AI-generated icon ideation. Bylinskii also discusses the challenges and potential in applying AI to graphic design, including the need for creative thinking, curation, and working with professionals from different fields. She advises candidates interested in AI and machine learning for graphic design to showcase project experience and pursue research opportunities.

  • 00:00:00 In this section, Zoya Bylinskii, a research scientist at Adobe, explains how AI can be used in the service of graphic design. Bylinskii talks about the intersection of graphic design and AI and how diverse stylistic forms of graphic designs can be deconstructed into computational modules that can be learned from and automated. She stresses that AI is not meant to replace designers but rather to enable designers with automation for tedious tasks and rapid exploration to generate design variants automatically while keeping the designer central to the design process and curation. Bylinskii gives two examples of these goals: resizing and laying out a design for different form factors and aspect ratios, and cycling through many possible visual representations when creating an icon, logo, or a similar design asset.

  • 00:05:00 In this section, Zoya Bylinskii discusses how design automation can increase the velocity of the design process by minimizing tedium and enabling more efficient iteration. Bylinskii goes on to explain how machine learning can predict visual importance in a design, providing more effective guidance for graphic designers by learning what is visually striking and attention-grabbing across different designs. Using an annotation tool, Bylinskii and her colleagues curated a dataset of a thousand design-annotation pairs and trained a model, built from standard classification modules, on this notion of importance; at test time the model predicts the most salient regions of a design, guiding designers on where to place other design elements.
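As an illustration of the general setup, not the architecture described in the lecture, a minimal sketch might train a tiny fully convolutional network to map a design image to a per-pixel importance map against human annotations; the dimensions and dummy tensors below are assumptions.

```python
import torch
import torch.nn as nn

class ImportanceNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, kernel_size=1),
            nn.Sigmoid(),                     # per-pixel importance in [0, 1]
        )

    def forward(self, x):
        return self.body(x)

model = ImportanceNet()
design = torch.rand(1, 3, 128, 128)           # dummy design image
annotation = torch.rand(1, 1, 128, 128)       # dummy human importance annotation
loss = nn.functional.binary_cross_entropy(model(design), annotation)
loss.backward()
print(float(loss))
```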

  • 00:10:00 In this section, Zoya Bylinskii discusses two applications of artificial intelligence (AI) in graphic design. The first is an interactive design tool that uses a small neural network to continuously recompute the predicted importance of design elements in real time; the tool also features a histogram and lets users adjust the importance level of each element to manipulate the design. The second is icon generation ideation, where AI is used to create new icons that correspond to common visual concepts. Bylinskii explains that both applications point to promising new directions for AI-assisted graphic design tools.

  • 00:15:00 In this section, the speaker explains the challenge that designers face when they try to create new iconography for a concept that does not have existing icons, such as sushi delivery. This process requires manual work, searches for related concepts for inspiration, as well as recombining and editing existing icons. To simplify this process, the speaker introduces a new AI-driven pipeline for compound icon generation. This system combines space, style, and semantics to generate compound icons that are stylistically compatible and semantically relevant to the queried concept. The AI-driven pipeline involves breaking down the query into related words, finding stylistically compatible icons, and combining them to convey the desired message.

  • 00:20:00 In this section, Bylinskii discusses a project called Iconate, which uses AI to suggest compatible icon combinations and layouts for creating new designs. The system learns an embedding space to suggest stylistically compatible icons and uses a template-based approach to define the layout of the constituent icons. Iconate was trained using the CompyCon1k dataset of 1,000 compound icons with annotated individual components. Bylinskii explains that the system allows users to create compound icons much faster than with stand-alone design tools, and that it could be used to quickly generate icons for any concept a user can think of. She also highlights other AI-powered design tools, such as logo synthesis and layout refinement systems, that aim to facilitate the design process rather than replace human creativity.
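A hedged sketch of this kind of pipeline; all names, embeddings, and the template format below are invented for illustration: split the query into concept words, pick an icon for each word that is stylistically compatible with the icons already chosen, then place them using a layout template.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical icon library: name -> style embedding (a real system would learn these).
library = {name: rng.normal(size=8)
           for name in ["sushi", "fish", "scooter", "box", "clock"]}

def pick_compatible(candidates, already_chosen):
    # Choose the candidate whose style embedding is closest to the icons chosen so far.
    def style_distance(name):
        if not already_chosen:
            return 0.0
        return float(np.mean([np.linalg.norm(library[name] - library[c])
                              for c in already_chosen]))
    return min(candidates, key=style_distance)

def compound_icon(concept_candidates, template=((0.0, 0.0), (0.6, 0.6))):
    chosen = []
    for candidates in concept_candidates:
        chosen.append(pick_compatible(candidates, chosen))
    return list(zip(chosen, template))         # place chosen icons at template positions

# "sushi delivery" broken into related concepts, each with candidate icons.
print(compound_icon([["sushi", "fish"], ["scooter", "box"]]))
```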

  • 00:25:00 In this section, the speaker discusses the use of AI in creating infographics, including text, statistics, and small visualizations. She also notes that this work is spread across different communities and conferences, and provides examples from computer vision, such as generating GUI designs using GANs. She notes that there are many resources available, including data sets for computational graphic design and creativity, and briefly mentions the Behance Artistic Media Data Set and the Automatic Understanding of Image and Video Advertisements Data Set.

  • 00:30:00 In this section, the speaker discusses the available models and tools for automating components within the design workflow, noting that many of the automatic tools are not very creative, but there is still a lot of potential for future discovery in the space of automated yet highly creative workflows. She encourages students to explore this space themselves and generate interdisciplinary thoughts, which can lead to exciting applications at the interface of computation and design. The discussion also touches on the limitations of current text-to-visual models in graphic design and the potential for new models that can generate vector graphics.

  • 00:35:00 In this section, the speaker discusses a project whose goal was to produce a caption from a given infographic, in order to search infographics on the web and annotate them for visually impaired users. However, the team ran into a problem: existing object detectors could not extract the visuals and icons in infographics. This led them to train an icon detector on synthetic data, which eventually made icon detection possible. Students later explored learning joint embeddings between icons and the text near them, which could be used to understand how abstract concepts are visualized in complex graphic designs. The speaker emphasizes that AI is not meant to replace designers but to help them, and that curation will remain an important part of the job.
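A small sketch of the synthetic-data idea only (the real pipeline and detector are not specified here): paste icons onto blank "infographic" canvases at random positions and record the bounding boxes, producing labeled examples on which an icon detector could be trained.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_synthetic_example(canvas_size=256, icon_size=32, n_icons=3):
    canvas = np.ones((canvas_size, canvas_size), dtype=np.float32)  # blank "infographic"
    boxes = []
    for _ in range(n_icons):
        icon = rng.random((icon_size, icon_size))                   # stand-in icon crop
        x = int(rng.integers(0, canvas_size - icon_size))
        y = int(rng.integers(0, canvas_size - icon_size))
        canvas[y:y + icon_size, x:x + icon_size] = icon
        boxes.append((x, y, x + icon_size, y + icon_size))          # ground-truth box
    return canvas, boxes

image, boxes = make_synthetic_example()
print(len(boxes), "labeled icons in one synthetic training image")
```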

  • 00:40:00 In this section, the speaker discusses the role of designers in the realm of AI-generated graphic design. While training models to generate designs is possible, it is difficult to train them to create entirely novel designs. Therefore, designers can introduce new assets and components that are beyond the current manifold, which can then be used to automatically manipulate and generate new designs. The speaker also emphasizes the need for curation, as designers can help identify garbage and non-garbage pairs to improve the training process. Furthermore, the speaker notes that adapting designs to different cultures is still a challenge due to the lack of sufficient data. Finally, the speaker explains the role of research scientists in companies like Adobe, who aim to pitch big research ideas that can be incorporated into existing product teams for further development.

  • 00:45:00 In this section, Zoya Bylinskii discusses the challenges of applying AI in graphic design to create practical products. She highlights the need to conceptualize problems in a way that makes them portable to different tech products, to pitch research ideas to companies, and to work alongside professionals from different fields for expertise. Bylinskii advises students and interns to develop a strong computational toolset to improve their chances of landing a position as an engineering, research, or product intern.

  • 00:50:00 In this section, the speaker focuses on the skills they look for in a candidate interested in AI and machine learning for graphic design. They stress the need for proficiency in software tools and machine learning. They recommend showcasing experience not just in course form but in project form, with examples on GitHub. They suggest that candidates need to display creativity and innovation, going beyond existing models and libraries to conceptualize new ideas and apply them in new ways. Candidates should pursue research experience or tech positions in a university lab; the speaker recommends approaching professors and offering to work for a specific period on particular problems. Finally, they emphasize the importance of references from other researchers attesting to the candidate's creativity, technical strength, and suitability for research.