I’m fascinated by how much this is exactly like working with a human artist who doesn’t really understand the domain that you are wanting to represent with an image. Iterate, iterate, iterate.
It seems like the most valuable thing this could do is get some of that early exploration out of the way faster and easier than a human can do it, get to two or three concepts that feel like they're in the neighborhood, and then let a human expert take over and turn it into something of final quality. That's pretty cool.
At the end of the article I also described a bit how I would see the evolution of such a tool, and it looks like we're thinking very similarly.
---
Though I think the real breakthrough will come when DALL-E gets 10-100x cheaper (and faster). I would then envision the following process of working with it (really just an optimization on top of what I've been doing now; a rough sketch in code follows the list):
1. You write a phrase.
2. You are shown a hundred pictures for that phrase, preferably from very different regions of the latent space.
3. You select the ones best matching what you want.
4. Go back to 2, 4-5 times, getting better results every time.
5. Now you can write a phrase for what you would like to change (edit) and the original image would be used as the baseline. Go back to 2 until happy.
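In code, roughly (a sketch only; generate_images and show_and_pick are hypothetical stand-ins for the model API and a picker UI):

    def explore(prompt: str, rounds: int = 5, batch: int = 100):
        """Iterative prompt-driven image search, as described above."""
        favourites = []
        for _ in range(rounds):
            # Sample widely at first, then steer toward what was picked.
            candidates = generate_images(prompt, n=batch, guides=favourites)
            favourites = show_and_pick(candidates)  # user keeps the best few
        image = favourites[0]
        # Editing phase: describe a change, with the current image as baseline.
        while change := input("What should change? (empty when happy) "):
            image = show_and_pick(generate_images(change, base=image, n=batch))[0]
        return image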
I see this happening in all areas. Everything would be prompt-driven.
Do you like this? What about this? You simply nod or reject the solutions that you don't want.
Pretty soon somebody's expertise and experience are not going to be enough to justify paying them what they used to get before this magic black box appeared.
One day enterprises will realize they can just outsource that expert who's been reduced to simply typing prompts and nodding yes or no.
I am worried that the middle class is rapidly disappearing. "We will own nothing and be happy" seems quite ominous. The question, then, is what field is safe from advancements in AI?
The only fields I can think of are doctors, lawyers, executives, and buy-side money managers. Even their jobs will be partially automated, but they'll be safe as long as they generate revenue.
You don't need nodding or really any conscious reaction, I think. It should be possible to have a camera directed at the face, hooked up to another AI that catches slight changes in pupil dilation or other changes imperceptible to the naked eye, and registers when something looks interesting to the user. You could then quickly show a stream of variations, pick the tagged ones, and use them to improve the guesses. I imagine something like this might one day become a preferred way of interacting with computers/AI.
But, if everyone's jobs are automated, nobody is making any money, so nobody has any money to pay doctors, lawyers, executives, money managers, etc. You would think that if these types were thinking rationally, they would be fighting to expand the middle class so more people can pay for their services.
In the past, eliminating humans from one set of jobs has been balanced by a new set of opportunities for humans in different jobs. Usually, the new jobs are more valuable.
That's not utopianism. The new jobs can't always be filled by the people kicked out of jobs. It really sucks to be them.
But it does mean that it's not irrational for people to want to automate other people's jobs. The net amount of stuff generated increases, rather than decreases.
This pattern may not last forever. There's already some thought that we've generated more than enough stuff to guarantee a decent standard of living to everybody (at least in the developed world) without working, and plenty more for luxuries if people choose to work. Even if we haven't reached it, we appear to be heading in that direction sooner rather than later.
That may cause a radical re-think at some point. And it won't be seriously delayed by making sure cartoonists have jobs.
Jobs are plentiful as long as wealth is well distributed.
In the past, fast automation has led to badly distributed wealth, and job loss. This situation has lasted until the unemployable people died off (yep, that was part of it), and enough wealth was redistributed through violent means.
Today we know better, and have really no reason to repeat the violent means of our previous revolutions. But it's really looking like the people in power want to repeat them.
> enough wealth was redistributed through violent means.
There were no instances of violent redistribution of wealth that left the average person better off than before; only a different group of people ended up with the wealth.
Automation makes stuff cheaper, even for people who didn't obtain any of the financial wealth via redistribution, because there's more than just financial wealth that gets created by automation: new availability of services and goods (think of today's internet; wealth that couldn't have existed before, and which one can benefit from even while poor).
> enough stuff to guarantee a decent standard of living to everybody
It's not a zero-sum game. There's still growth in us. We'll go to space and expand 1000x more; space has plenty of resources, and humans will have jobs working together with AI.
We'll have to automate childcare to make that happen. Otherwise, the birthrates of the rest of the world will follow the countries with the highest standards of living on a wild plunge into unsustainability.
Just like in Star Trek. They really knew what the end goal was didn't they.
> enterprises will realize they can just outsource that expert who's been reduced to simply typing prompts and nodding yes or no
Tbf, a program that just tracks the market average demonstrably gives better returns than most of the financial industry, yet that industry still exists. Even if we can automate something, it doesn't mean we will, usually for pointless emotional reasons.
But on the other hand, it's hard to say whether in 100 years humans will still be employable in any practical capacity for literally anything.
>Pretty soon somebody's expertise and experience are not going to be enough to justify paying them what they used to get before this magic black box appeared.
I doubt it, because the process of thinking of phrases to feed dall-e is really the hard bit.
This is OK for a logo like this, where it's fair to say the base-level expectation is not super creative. This logo is cool, but it doesn't really stand out or make the product very distinctive. If I am running a hobby or OS project that's fine, but if I was investing a lot in sales/marketing then paying a real artist to make something interesting and novel is a rounding error.
> This logo is cool, but it doesn't really stand out or make the product very distinctive. If I am running a hobby or OS project that's fine, but if I was investing a lot in sales/marketing then paying a real artist to make something interesting and novel is a rounding error.
Q: Are there really logos out there that are "interesting and novel" and that "stand out or make the product [..] distinctive"? Which ones?
EDIT: (perhaps more importantly) are there interesting, novel, distinctive logos that actually contribute to profitability?
tbf I think when it comes to big company branding it's the opposite.
A lot of GPT iterations of the design have left the article author with something quirkier than your average logo, but it also looks like clipart and probably doesn't scale up or down well or work in monochrome. Which is fine for OSS. (He might get more users from blog traffic about using GPT-3 to design his logo than he ever could from any other logo anyway.)
But when it comes to bigger companies, the design agency are the people that sit in meetings with execs persuading them that a well-chosen font and a silhouette of a much-simplified octopus will work much better ("but maybe the arms could interact with some of the letters, etc.; now let's discuss colours"). The actual technical bit of drawing it is the bit that's already relatively cheaply and easily outsourced, and plenty of corporate logos are wordmarks that don't even need to be drawn...
Doctors are very vulnerable. Most of dermatology is simple pattern recognition. I can easily see AI lawyers beating human lawyers in litigation, too. An AI lawyer will have read every single case and know the outcomes, and can fine tune arguments for specific parameters like which judge etc.
> Most of dermatology is simple pattern recognition.
I have a few qualms with this app:
1. For a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.
2. It doesn't actually replace a USB drive. Most people I know e-mail files to themselves or host them somewhere online to be able to perform presentations, but they still carry a USB drive in case there are connectivity problems. This does not solve the connectivity issue.
3. It does not seem very "viral" or income-generating. I know this is premature at this point, but without charging users for the service, is it reasonable to expect to make money off of this?
This workflow reminds me of a generative art program from the early 1990s, but I just can't remember its name. It was a DOS or Windows program that had a very curvy, fluid GUI with different graphics sliders. It would show you some random tiles and you choose one to guide the algorithm's next generation of tiles.
I wonder if Kai Krause lurks here at HN. I'd love to know how he's doing. Apparently he's still living in his castle, which he bought around 1999 [0].
Sometime in the '00s I read an article about him saying he was putting advanced networking gear into the castle and intended to start something like a "think tank" (doesn't really fit, but I don't know what I'd call it) where he and others would hang around and code stuff.
I found the article [1] from July 2002, "Lord of the Castle Kai Krause presents Byteburg II".
> So that's Kai Krause's long-cherished plan: Now the software guru has finally opened a center for founders and developers from the IT and software industry in Hemmersbach Castle near Cologne -- the Byteburg II
I really wonder what he's doing these days. His plug-ins were legendary, as was the user interface for Bryce [2].
Your comment really intrigued me to google this interesting person I had never heard about before. This may well not be new to you, but Kai has a not-a-blog blog that I stumbled upon here: http://kai.sub.blue/en/sizemo.html
Some really interesting reads. I especially appreciated his articles on the passing of Douglas Adams (apparently a close friend of his!) and Then vs Zen.
Love him or hate him (and I do both), Kai was all about cultivating his adulating cult of personality and dazzling everyone with his totally unique breathtakingly beautiful bespoke UIs! How can you possibly begrudge him and his fans of that simple pleasure? ;)
In the modest liner notes of one of the KPT CDROMS, Kai wrote a charming rambling story about how he was once passing through airport security, and the guard immediately recognized him as the User Interface Rock Star that he was: the guy who made Kai Power Tools and Power Goo and Bryce!
Kai's Power Goo - Classic '90s Funware! [LGR Retrospective]:
>Revisiting the mid 1990s to explore the world of gooey image manipulation from MetaTools! Kai Krause worked on some fantastically influential user interfaces too, so let's dive into all of it.
>"Now if you're like me, you must be thinking, ok, this is all well and good, sure, but who the heck is Kai? His name's on everything, so he must be special. OH HE IS! Say hello to Kai Krause. Embrace his gaze! He is an absolute legend in certain circles, not just for his software contributions, but his overall life story." [...]
>"... and now owns and resides in the 1000 year old tower near Rieneck Castle in Germany that he calls Byteburg. Oh, and along the way, he found time to work on software milestones like Poser, Bryce, Kai's Power Tools, and Kai's Super Goo, propagating what he called "Padded Cell" graphical interface design. "The interface is also, I call it the 'Padded Cell'. You just can't hurt yourself." -Kai
But all in all, it's a good thing for humanity that Kai said "Nein!" to Apple's offer to help them redesign their UI:
>read me first, Simon Jary, editor-in-chief, MacWorld, February 2000, page 5:
>When graphics guru Kai Krause was in his heyday, he once revealed to me that Apple had asked him to help redesign the Mac's interface. It was one of old Apple's very few pieces of good luck that Kai said "nein"
>At the time, Kai was king of the weird interface - Bryce, KPT and Goo were all decidedly odd, leaving users with lumps of spherical rock to swivel, and glowing orbs to fiddle with just to save a simple file. Kai's interfaces were fun, in a Crystal Maze kind of way. He did show me one possible interface, where the desktop metaphor was adapted to have more sophisticated layers - basically, it was the standard desktop but with no filing cabinet and all your folders and documents strewn over your screen as if you'd just turned on a fan to full blast and aimed it at your neatly stacked paperwork.
>Bruce “Tog” Tognazzini writes about Kansei Engineering:
>»Since the year A.D. 618 the Japanese have been creating beautiful Zen gardens, environments of harmony designed to instill in their users a sense of serenity and peace. […] Every rock and tree is thoughtfully placed in patterns that are at once random and yet teeming with order. Rocks are not just strewn about; they are carefully arranged in odd-numbered groupings and sunk into the ground to give the illusion of age and stability. Waterfalls are not simply lined with interesting rocks; they are tuned to create just the right burble and plop. […]
>Kansei speaks to a totality of experience: colors, sounds, shapes, tactile sensations, and kinesthesia, as well as the personality and consistency of interactions.« [Tog96, pp. 171]
>Then Tog comes to software design:
>»Where does kansei start? Not with the hardware. Not with the software either. Kansei starts with attitude, as does quality. The original Xerox Star team had it. So did the Lisa team, and the Mac team after. All were dedicated to building a single, tightly integrated environment – a totality of experience. […]
>KPT Convolver […] is a marvelous example of kansei design. It replaces the extensive lineup of filters that graphic designers traditionally grapple with when using such tools as Photoshop with a simple, integrated, harmonious environment.
>In the past, designers have followed a process of picturing their desired end result in their mind, then applying a series of filters sequentially, without benefit of undo beyond the last-applied filter. Convolver lets users play, trying any combination of filters at will, either on their own or with the computer’s aid and advice. […] Both time and space lie at the user’s complete control.« [Tog96, pp. 174]
>Anyone who has been using Macs for at least the last ten years will surely remember Viewpoint Corporation’s products. No? Well, Viewpoint Corporation was previously MetaCreations. Still doesn’t ring a bell? Maybe MetaTools will. Or the name Kai Krause. Or, even better, the names of the software products themselves — Kai’s Power Tools, Kai’s Power Goo, Kai’s Photo Soap, Bryce, Painter, Poser… See? Now we’re talking.
>Experienced 3D professionals will appreciate the powerful controls that are included, such as surface contour definition, bumpiness, translucency, reflectivity, color, humidity, cloud attributes, alpha channels, texture generation and more.
>KPT Bryce features easy point-and-click commands and an incredible user interface that includes the Sky & Fog Palette, which governs Bryce's virtual environment; the Create Palette, which contains all the objects needed to create grounds, seas and mountains; an Edit Palette, where users select and edit all the objects created; and the Render Palette, which has all the controls specific to rendering, such as setting the size and resolutions for the final image.
>He intends to challenge everything you thought you knew about the way you use computers. 'I maintain that everything we now have will be thrown away. Every piece of software -- including my own -- will be complete and utter junk. Our children will laugh about us -- they'll be rolling on the floor in hysterics, pointing at these dinosaurs that we are using.
>'Design is a very tricky thing. You don't jump from the Model T Ford straight to the latest Mercedes -- there's a million tiny things that have to be changed. And I'm not trying to come up with lots of little ideas where afterwards you go, "Yeah, of course! It's obvious!"
>'Here's an easy one. For years we had eight-character file names on computers. Now that we have more characters, it seems ludicrous, an historical accident that it ever happened.
>'What people don't realize is that we have hundreds more ideas that are equally stupid, buried throughout the structure of software design -- from the interface to the deeper levels of how it works inside.'
The model starts from a 64x64 8-bit RGB image of noise (random pixels), so technically one of 256^(64*64*3) possible starting images, but most will probably be perceptually very close to each other, as the color differences won't be that large. The image is then further upsampled by two other models, which will change some details but shouldn't affect the general composition of the image.
Maybe I'm wrong, but with these diffusion models there is randomness in every sampling step too, not just in the initialization, and they can have 1000 steps to generate a single image.
Ah, good point. This would introduce more variation if the initial noise is merely close; but if the initial noise is exactly the same, it probably means it was initialized with the same seed, and then the rest of the generation will be the same too, since the random number generators are deterministic.
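To make that concrete, a minimal sketch (PyTorch; denoise_step is a hypothetical stand-in for the real model): both the initial noise and every per-step noise draw come from the same seeded generator, so the whole trajectory is reproducible.

    import torch

    def sample(model, seed: int, steps: int = 1000):
        gen = torch.Generator().manual_seed(seed)
        x = torch.randn(1, 3, 64, 64, generator=gen)   # initial 64x64 RGB noise
        for t in reversed(range(1, steps + 1)):
            eps = torch.randn(x.shape, generator=gen)  # fresh randomness each step
            x = model.denoise_step(x, t, eps)          # hypothetical model call
        return x

    # Same seed, same model -> bit-identical image:
    # assert torch.equal(sample(m, seed=42), sample(m, seed=42))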
Yeah, my first thought was "Ok, but you are going to need to involve a graphical artist to actually really make use of that logo". Like you probably want a vector version and you definitely need simplified versions for smaller sizes but then I stopped and realized how amazing this actually is. It "saved" (I know, it cost $30 but that's a steal for something like this) all the time and money you would have paid for iteration after iteration and let the author quickly hone in on what they wanted.
As someone who is incredibly terrible at graphic design but knows what they like this could be a game changer as iterations of this technology progress. I can imagine going further than images and having AI/ML generate full HTML layouts in this iterative way where you start to define your vision for a website or app even and it spits out ideas/concepts that you can "lock" parts of it you like and let it regenerate the rest.
I'm not downplaying designers role at all, I'd still go to one of them for the final design but to be able to wireframe using words/phrases and take a good idea of what I want would be amazing, especially for freelance/side-projects.
Honestly though the hard part is the actual design which is already done here. Learning to vectorize a raster is something that can be done in a weekend with Inkscape, there's no reason to involve an actual graphics designer with this anymore.
> Learning to vectorize a raster is something that can be done in a weekend with Inkscape, there's no reason to involve an actual graphics designer with this anymore.
If you lined up 100 resulting images, 99 from weekend beginners and 1 from an actual artist, I guarantee you would pick out the artist every time.
It might be simple to trace over an image, but you are probably better off getting an artist to spend 2 hours on it; it will most likely look better than 2 weeks of tracing.
Time value of money. The optimal use of money and time would be getting the ML to iterate until you have the finished product, then getting a designer to vectorise it and fix it up. That way you pay the designer for one iteration, and spend all the time you would have spent iterating with the designer iterating with the ML model instead.
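For what it's worth, the purely mechanical part of the vectorisation is scriptable. A minimal sketch, assuming the potrace CLI is installed and Pillow is available (single-colour tracing only; colour separation and cleanup are the parts a designer still earns their fee on):

    import subprocess
    from PIL import Image

    # Threshold the raster logo to a 1-bit bitmap, then trace it to SVG.
    Image.open("logo.png").convert("1").save("logo.pbm")
    subprocess.run(["potrace", "logo.pbm", "-s", "-o", "logo.svg"], check=True)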
I think you might be underestimating how much work goes into the last mile of a design. A lot of refinement work goes into typography in particular, a domain Dall-E isn’t yet proficient in at all.
I think your art/design/craft is pretty good. Some people use pencils, some use Adobe products, you have gone out there and tried the new Dall-E medium.
Glad you thought out the usage, I am sure that when the novelty wears off that you will have that neat-as-octocat logo sorted out.
I appreciate that you appreciate the value that highly skilled designers bring to a product with their visual expertise.
However, I would like to see you A/B test the DALL-E logo versus the winning designer logo. You could show odd IP addresses one logo and even addresses the other.
I think the designer would edge out the robot for what you need (a logo); however, the proof is in the pudding, and in the conversion rate.
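The split itself is trivial to wire up. A sketch with Python's stdlib, reading "odd/even" as last-octet parity (a real test would use sticky hashed assignment, and IPv6 would need handling):

    from http.server import BaseHTTPRequestHandler, HTTPServer

    class LogoAB(BaseHTTPRequestHandler):
        def do_GET(self):
            # Odd last octet -> one logo, even -> the other.
            last_octet = int(self.client_address[0].rsplit(".", 1)[-1])
            logo = "designer.svg" if last_octet % 2 else "dalle.png"
            self.send_response(200)
            self.send_header("Content-Type", "text/html")
            self.end_headers()
            self.wfile.write(f'<img src="/static/{logo}" alt="logo">'.encode())

    # HTTPServer(("", 8000), LogoAB).serve_forever()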
Plus there is no reason why someone couldn't build a specialised AI model to do vectorisation and another to generate simplified versions of vectors.
People are already doing this by combining DALL-E 2 with GFPGAN for face restoration. So there may be a role in understanding how to combine these tools effectively.
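A sketch of that kind of chain, based on the 2022-era openai and gfpgan Python packages (treat the exact signatures as assumptions and check the current docs; the GFPGAN weights file has to be downloaded separately):

    import cv2
    import numpy as np
    import openai
    import requests
    from gfpgan import GFPGANer

    # 1) Generate a raster image with DALL-E.
    url = openai.Image.create(prompt="portrait photo of a sea captain",
                              n=1, size="1024x1024")["data"][0]["url"]
    img = cv2.imdecode(np.frombuffer(requests.get(url).content, np.uint8),
                       cv2.IMREAD_COLOR)

    # 2) Restore/sharpen the faces with GFPGAN.
    restorer = GFPGANer(model_path="GFPGANv1.3.pth", upscale=2)
    _, _, restored = restorer.enhance(img, paste_back=True)
    cv2.imwrite("restored.png", restored)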
Yes! It gives powerful tools for someone with a concept to get much closer to visualization of their idea.
DALL-E 2 is like a low-code or no-code tool in that way.
The outcome may not be a "finished" product, especially as viewed by a professional designer (or web dev). However, it's a heck of a lot better than a tersely written spec.
And in some cases, the product will work well enough to unblock the business, get customer feedback and generally keep things moving forward.
I think this is more powerful than a simple exploration tool. It took the author a long time to find a query format that generated logo-like images. Once they had that part down, they were quickly able to iterate on their query to find an image they liked. They were even able to fix part of the logo using the fill-in tool. I'm not sure why you'd bring a human into the mix, especially if you're on a budget.
An experienced human designer, right away, is going to ask how you want the logo to be used. That's going to have a major impact on how it's designed.
So yeah, this may be like working with a doodler, but, as the author intimated, this is far from an ideal experience in getting a professionally designed logo. This is more like "Hey, you, drawing nerd, make this thing."
Nevertheless, astonishing technology in its own right.
Nah, people will leave out the professional. The same wild-west grab-whatever-you-can, steal and plunder, to the detriment of artists, writers, etc. And when the legislation arrives it will already be too late; accidentally, of course.
Why should there be legislation? Do you want to restrict what people can do, just to force them to employ artists and writers? We could also forbid people from filling the gas tanks in their own cars, to protect the job of gas station attendant, but nobody wants to live in New Jersey.
You remember the concept of dumping, i.e., flooding a market with below-cost product to drive out competing businesses? This is dumping for creatives.
edit: not that it's intentional, but these things will have the same effect: way too much product, even for creative works. No one will be able to make money off the product, only off the tools.
This blog post proves that Dall-E 2 will not make human taste and design ability obsolete. The final image he ended up with is a lot uglier and more complicated than most of the intermediate steps. I think generative art AIs will have a similar effect on design as compilers have on software development, and will not put artists out of a job.
Not trying to be a luddite and/or vehemently defend the noble profession of nuanced graphic design, BUT...
Those iterations suck. I'm not worried for my colleagues and me.
That being said! Many, MANY clients have questionable taste, and I can, indeed, see many who aren't sensitive to visuals to be more than happy with these Dall-E turd octopus logo iterations. Most people don't know and don't care what makes good graphic design.
For one thing, that final logo can't scale. For another, the colors lack nuance & harmony. The logo is more like a children's book illustration, and not something that is simple, bold, smart, and can be plastered on any and all mediums.
Just my 2 cents.
I bet in another 10-15 years, though, things might get a bit dicier for fellow graphic designers/artists/illustrators, as all this tech gets more advanced.
I feel like you look at this too much as a creator rather than as the customer. The logo may not be optimal for every medium, may not have a great palette, may not have the feel you would give it... But the author is happy with the result, so who are we to say it's bad/good? Paraphrasing @mipsytipsy: "colour harmony doesn't matter if the user isn't happy".
(yes, I get the nuance where it's part of the designers job to explain why certain design elements are more beneficial, but the general point stands for "i want a logo for my small project" case)
Why is the creator the only one that needs to be happy? I assume they created that project to be used by others and to possibly monetize it. That sounds more like the users / clients are the ones that are supposed to like it...
I never understood this logic, where the creator of something does something seemingly stupid and people are like "Well, don't use their project then if you don't like it". Instead of constructively calling the problem out, so the creator can try to make it better.
If my logo sucked, I'd like people to please tell me...
> Why is the creator the only one that needs to be happy?
Because they define what success is. If their goal is to make money, they may want a logo which is closest to optimal for getting clicks. If it's a private project, they may want it to be fun. And many other scenarios... You're welcome, of course, to offer constructive criticism, but in the end it's up to them whether they apply it.
You're right. A million shitty logos are created every day, and for the vast majority of them, they will serve their purpose. And contrarily, there will always be a marketplace for companies/entities who want a logo that has purpose, novelty and intelligence behind its design. I definitely see a chasm between an AI-catered subclass and human-catered superclass forming.
Weirdly, with the advent of AI, we might start to see exactly what it is that makes human beings special.
I think a tool like this might be good to help clients get through a few ideation phases on their own prior to showing up to the first discussion with branding / graphics / design professionals. At least it might get them closer to understanding the impossibility of their 7 perpendicular red lines requirement.
It certainly reduces the number of designers necessary. Just because it doesn't obliterate all of the designers doesn't mean the profession isn't at risk. Today fewer data-viz experts are hired despite the proliferation of data, since we now have Tableau, Looker, etc.
A starker example: how many lift operators do you see today?
I think this is valid criticism and feels similar to restaurants that don’t put pepper on the table because the chef considers the food to be seasoned to the intended level before it leaves the kitchen. Some customers may be turned off by that level of pride, but other customers are willing to pay a premium for that level of pride to be shown by their chef.
That's some strong copium you got there, can I have some of what you're smoking?
Ultimately the average person (who is likely the target audience anyway) won't notice anything wrong with most of those iterations and given that they're basically free in comparison would make me worried. I wouldn't be surprised if they manage to make it output svgs soon.
I will say, though, I think DALL-E has opened up a new market for artists. I've gone to freelance graphic designers before, and been generally happy with the results, but it's pricey. So pricey that I honestly can't justify it for a new project I intend to sell, or for an open source project I don't expect to make money from. It's usually more cost-effective to hire lawyers or even UI/UX people.
If I were an artist, I'd be experimenting with DALL-E, trying to run my own pirate version and learning everything about it. An artist empowered with DALL-E could give quick options to a client, iterate with them quickly, and test out some ideas before making the final work product. I'd guess a good artist who made good use of DALL-E could get a project done much faster and cheaper, and this would likely mean a lot more people hiring artists (if I could spend $100-200 for high-quality assets within a few days rather than $1000-2000, I'd gladly hire artists frequently).
I'm sure this will make some artists feel cheapened, but the reality is that art & technology have always evolved in dynamic and unpredictable ways. ML being essentially curve-fitting means that genuine inspiration and emotion is still far beyond our capabilities today, and that, ultimately, these models will only give us exactly what we ask for. A good (human) artist can go beyond that.
EDIT: Also, I agree with your assessment of the "work product," if we can call it that. I was unimpressed with the iterations, and especially the final product. I guess it's good the product is an open source tool. Nothing about the generated logo helped me understand what the OctoSQL tool did. Honestly, the name (which also IMO isn't excellent) is much more evocative than that logo. Why is the octopus wearing a hard hat? Why is it grabbing different colored solids? I guess the solids are datasets? But then the octopus is just exploring them? No thanks.
It's kinda funny that your main complaint about the final logo is that it doesn't tell you much about what the project does.
I can't think of a single well known logo that is even remotely close to what a company's product is. Photoshop, Firefox, Chrome, Microsoft, Facebook, Apple, Netflix, McDonalds, Ford, Ferrari, Samsung, Nvidia, Intel, RedHat, Uber, Github, Duolingo, AirBnB, Slack, Twitter, IntelliJ, Steam.
I guess the Gmail logo does tell you it has something to do with mail though, so I did find one example.
Most of those examples are company logos, and the branding for the company is different than the branding for its products.
So whereas Ford's brand is just a name, "Mustang" has a logo that really does tell you something about the car. You kind of understand when you see the galloping horse what it's meant to do.
Intel brands its CPUs with the name inside a square, which is colored to resemble (abstractly) a CPU.[0]
And Photoshop once had a logo that communicated what it did.[1]
As a brand becomes more established, it tends to be more abstract. Whereas Starbucks was once an elaborate siren (I interpreted it to be the siren call of espresso), details have been simplified over the years.[2] This is similar to the Photoshop magnifying glass logo becoming "Ps".
After the Apple I and Apple II, Apple sometimes used apple varieties (plus Lisa) to brand its products (e.g. Macintosh, Newton). However, this largely stopped in the late 90s when Steve Jobs returned. Macintosh was shortened to Mac, and 'i' was prepended to various product names. Most new ones were descriptive, e.g. iPod, iPhone, iPad, Apple Watch. The computers have retained "Mac" in the branding, along with "book" for notebooks (a convention predating Steve's return). The logos for all of these are just the names of the products typeset in Apple's own San Francisco font; whenever Apple appears in a product name, the Apple logo is used instead.
So, yeah, I think it's reasonable to communicate what a product does or why a project exists with its logo. I didn't really see that w/ OctoSQL.
EDIT: I should also address Firefox & Chrome.
Firefox started as Phoenix (i.e. rising from the ashes of Netscape Navigator/Mozilla). Phoenix had a trademark conflict, so it was renamed Firebird. This also had a conflict, and Firefox was chosen after. In the Zeitgeist of the early aughts, Phoenix made a ton of sense: instead of the extremely bloated chrome around the page that had been prevalent in Navigator and Internet Explorer, Phoenix gave you a tab bar (truly revolutionary), the navigation bar and the bookmarks bar. It was simple and clean, like a reborn phoenix.
Chrome is interesting because the name is not related to traveling or navigation. It's telling you it's just the container for what you care about. But the logo is a bit more like a sphincter or an all-seeing aperture. I've never gotten the logo for Chrome outside a spyware context, but it has become successful.
I agree. But I think the key thing is that deciding what phrase to feed the system was still the key task. Creative people are unlikely to be out of a job anytime soon, even if they end up using something like DALL-E to make quick prototypes.
> Most people don't know and don't care what makes good graphic design.
But isn't the logo created for most people? Does it matter that, you as a designer, think it's bad if most people don't? I see it like modern fashion shows. I look at them and think the clothes are insane and I would never wear them, but obviously other fashion designers think they look good (I'm guessing?).
I do agree that the logo isn't super practical though, it's too textured and won't scale. I would take it to /r/slavelabour or Fiverr and pay someone to vectorize it and see what they come up with.
Even things that are created for most people usually need a professional to make it actually good for regular folks. Just like most people can tell if a song is musically good or not, but would struggle to actually create that themselves. Or they know when a physical thing is easy to use, but they'll struggle to create things themselves that are easy to use.
But the point here is exactly that they don't need to create it, they just need to judge it. They make the AI create the logo and then decide if they like it.
I understand your argument, but I don't think that's the problem. The problem is that even most users don't understand what a good logo looks like (even if they like them), just as users don't know what they want. It's a known fact that you shouldn't ask the users of a piece of software how it should be designed, because if you let them design the software they want, it would be shit.
I work in the AI field, but not on image generation.
I don't think it would be technically hard to build a model with current technology which can generate logos with the attributes you mentioned. You could simply fine-tune a DALL-E-style model on a smaller dataset of logos. It would just take a small, dedicated team of domain experts to work on the problem.
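The shape of that fine-tune is standard. A minimal PyTorch sketch, where pretrained_model and logo_dataset are hypothetical stand-ins for the general text-to-image model and a captioned logo collection:

    import torch
    from torch.utils.data import DataLoader

    model = pretrained_model()  # start from the general model's weights
    opt = torch.optim.AdamW(model.parameters(), lr=1e-5)  # small LR: adapt, don't retrain

    for images, captions in DataLoader(logo_dataset, batch_size=32, shuffle=True):
        loss = model.training_loss(images, captions)  # e.g. the usual denoising loss
        opt.zero_grad()
        loss.backward()
        opt.step()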
I've seen people screenshot logos at low res, save them as JPEGs, share them over WhatsApp, and put them on A0 posters. With SVG and EPS versions of the logos easily available. With detailed guidelines on how to use them. Point out the mistake and they still don't see anything wrong.
The thing you're missing is that AI-generated content can be refined by AI. If Disney promised their meh-looking movie would improve on its own over time, people would line up for it because it's new, not just the streamlined copy-pasted design we see all over media now.
Painting the Titanic wasn't the hard part. The hard part was organizing the process that produced its structure. That's where AI content is now.
We’re generating the bulk structure pretty competently at this point. Refining the emotional touches will come faster.
I disagree with the analogy you draw (no pun intended). Good creative design is the edge case for a model like this and is naturally much less tractable than getting to this level of design (I’m not a fan).
> I bet in another 10-15 years, though, things might get a bit dicier for fellow graphic designers/artists/illustrators, as all this tech gets more advanced.
That's a long time. I expect within a decade or two, "AI" should be able to generate an entire animated movie given nothing but a script.
Unless the tech learns to reason, it will never be able to do anything other than recombine and remix prior art. (Which is maybe what many designers already do, but it won’t ever spit out a Paul Rand logo.)
Honestly logos are currently a very low entropy art form, much lower than graphic design which is already quite low compared to many forms of art (obviously my subjective opinion, but I'd like to think I have strong reasons). If anything, I think logo design is one of the first things ai can achieve human parity on. Obviously the style in this post was unorthodox for a logo, so I wouldn't even rule DALL-E out, with the right prompt engineering.
However, once you reach a certain budget, it's much more involved to *choose* a logo that "fits" how the company wants to present itself, than it is to generate candidate logos of sufficient quality. I can assure you that the "many-chefs problem" for a high budget design project is very real, and the major cost driver. You have a mix of "design by committee", internal politics, what designers wants on their portfolios, etc etc.
I was thinking something similar. The editing process is still a human one, and I agree that the one chosen was weaker than a lot of the intermediate choices. It's a matter of taste, obviously, but to me the red ball with a nondescript sketched square around it feels unfinished. The yellow cartoony logos look more finished and professional to me.
The tech will get better, but ultimately there still has to be a human who decides 'that's the one that looks good', which strongly depends on someone's taste and skill in identifying what a good image looks like.
There will probably be less need for designers of 'lower quality' simple images though.
This is an interesting conversation. Good taste is what we see and like … but also patterning after people we want to impress / be associated with, is it not?
Taste is very complex: it's hierarchical, social, not fixed, not absolute, not rational, is specific to audience and has irregular overlaps across groups, much of it (all?) derived from human sensation and context-specific situations.
The path to something being considered as good taste is generally not simple: much of it flows through lines of power/desire/moment whose branches are not easy to trace as they're being formed. Much of taste is the hidden "why" which most of us never see.
It's realistic that Dall-E could understand what trends are on the rise, or in good taste … it's much harder to say if Dall-E could create something of originally good taste.
That just sounds like pattern recognition with extra variables. Subdividing people into groups and then analyzing them certainly doesn't sound like a task that a machine will struggle with. Why should the algorithm need to be able to see the hidden "why" when most of us creative types can't see it or define it either? It's just a function of having observed enough people of a certain type. You want to generate something that will impress the people I'm targeting? Just analyze the posts of all my followers on social media. Analyze the content that is "liked" by people in my demographic range and with close proximity to where I live. Analyze the works of creators who belong to my generation and who listen to the same music as me. Do that all nearly instantly and then offer me a selection of options picked from those various methods. I don't expect "good taste" will be hard to conjure up. I already can't tell that a lot of these octopus drawings weren't created by a talented human, and we're still early and unsophisticated in our data analytics.
> has to be a human who decides 'that's the one that looks good'
Assuming the status quo, true. As we evolve our lives around emerging AI tech I think we will at first be the curators and creative directors of AI, but eventually a creative agency will defer to the AI as it knows more about our tastes, market, audience, and the ENTIRE HISTORY of art, design, marketing, tastes, trends, and so on.
Eventually it won't make sense to have a stupid human rubber stamp what the all powerful AI suggests. Just as it does not make sense for Facebook to curate news feeds.
Maybe one day product advertising will look different depending on who looks at it. Pepsi logo "just for you".
I still remember a HN article, might have been a Paul Graham essay, from 15 years ago about "Why are all of Trump's buildings so poorly designed when he can afford the best designers?" It came down to the fact that he personally has bad taste and therefore cannot pick good designers or approve good designs.
That aside, a great use of these tools is to generate N quick takes in wildly varying styles that you can present to the customer very quickly and very cheaply. Once you pin them down to a particular range of styles, you can get down to carving out the details by hand.
Right now, the input to DALL-E is all human generated.
What will happen is that DALL-E will generate something "close enough" that gets used and promulgated, so now the input to DALL-E will become increasingly contaminated with output from DALL-E.
We're already starting to see this in search engines where you get clickbait that seems to be GPT-3 generated.
If you can have humans sort the generated images into "good quality" and "bad quality", you can just keep iterating. Our subjective ratings are another score to optimize for.
When you run a phrase, you get four images. Those images will stay in your history, but the ones you like you will save with the "save" button, so that they're in your private collection.
With this, you already have a great feedback system: saved - good, not saved - bad.
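One plausible way to use that signal (a sketch; the history_batches iterator is assumed to yield, say, CLIP embeddings of generated images plus a saved/not-saved bit):

    import torch
    import torch.nn as nn

    # Small "worth saving?" scorer over 512-d image embeddings; its output
    # could rerank candidates or serve as a reward signal for the generator.
    scorer = nn.Sequential(nn.Linear(512, 128), nn.ReLU(), nn.Linear(128, 1))
    opt = torch.optim.Adam(scorer.parameters(), lr=1e-3)
    loss_fn = nn.BCEWithLogitsLoss()

    for emb, saved in history_batches():  # hypothetical: (embeddings, 0/1 labels)
        loss = loss_fn(scorer(emb).squeeze(-1), saved.float())
        opt.zero_grad()
        loss.backward()
        opt.step()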
I've saved some of the worst images Dalle generated to be able to showcase just how bad it can be sometimes. And then other times the bad image is hilariously bad. They can probably build another layer on top of the feedback system though to filter that sort of thing out.
I would guess your use-case is a statistical anomaly. If most of the images that are saved are saved by people who like them best, which is most likely the case, enough data will erase the problem.
Sure, but there are millions of people on the DALLE waitlist, who would happily rate the output for better performance / more credits. The famous ImageNet data set only has 1.2M images.
DALL-E2 and similar are unbundlings: the best artists synergize 1) technical ability with 2) good taste. 1 is the ability to climb a hill and 2 informs the direction of "up", and both take years to develop well.
What's really interesting about this class of AIs is that they unbundle the two and you can play with them independently for the first time.
Train Dall-E on more logos that you like. I can imagine a creative agency purchasing a Dall-E 2 instance and training it up on a model specific to the work and clients they have ongoing.
If nothing else, inspiration is just a click away. No more searching for ideas, just talk to the AI and it will pump out numerous ideas for you.
Will DALL•E 2 make human taste obsolete? No, absolutely not. But DALL•E 3? 4? Other similar models in the next 5 years? Absolutely yes. This blog post proves that with current algorithms, human input is needed, but it proves nothing about future algorithms.
In my personal opinion as an (admittedly junior) ML engineer and lifelong artist, we've got <10 years before the golden age of human-made art is completely over.
I agree, what a clunky process. Hard to express in written prose what you want, so much ambiguity.
Even if you get close to what you, the human, may like--it's difficult if not impossible to articulate what you like about it and iterate. Black box, keep trying random keywords... May as well grab a marker (read: hire a human)
It depends.
Is the customer happy with the result?
Beauty is in the eye of the beholder.
There are many professions where cheap products killed handmade quality.
Not sure if this will be considered off topic, my apologies if so.
The article says that octopi is the plural of octopus, but it's actually octopuses. Octopus is originally Greek, not Latin and thus does not get the Latin plural -i, but instead would get the Greek plural -odes. Since it ends in a way English can deal with, the commonly accepted usage is octopuses (English) over octopodes (Greek) with octopi being the least correct.
Oxford & Merriam-Webster list both plurals and the author calls out that octopi is "the quite beautiful plural form of 'octopus' " which could be interpreted as "while there are multiple correct plurals of octopus, octopi is the beautiful one."
While “octopi” has become popular in modern usage, it’s wrong.
I would argue that it used to be wrong, but language, unlike physics and code, is what the majority say it is.
The Oxford English Dictionary is not an arbiter of proper usage, despite its widespread reputation to the contrary. The Dictionary is intended to be descriptive, not prescriptive. In other words, its content should be viewed as an objective reflection of English language usage, not a subjective collection of usage ‘dos’ and ‘don'ts’. However, it does include information on which usages are, or have been, popularly regarded as ‘incorrect’. The Dictionary aims to cover the full spectrum of English language usage, from formal to slang, as it has evolved over time.
Now I think it's something that is just fun to argue about, but I don't take any of it seriously.
It's a loan word, there isn't any 'correct' or 'incorrect' answer. Language is always evolving, which is why dictionaries are often descriptive instead of prescriptive.
Somehow I recall being told that indexes is the correct plural of the section at the end of a book, and indices is correct for subscripted things in maths and therefore programming.
I don't think a particularly convincing reason was advanced, other than "technical things are more Latin-adjacent".
Octopi is also THE epitome of the "-i" pluralization. I see people using focuses more than foci, but it's a common callout that the plural of octopus is octopi.
The logo was created for OctoSQL[0] and in the article you can find a lot of sample phrase-image combinations, as it describes the whole path (generation, variation, editing) I went down. Let me know what you think!
And btw, if you get access, take a look at [1] before you start using it. A ton of useful bits and pieces for your phrases.
TLDR: DALL·E 2 is really cool, though it takes quite a bit of work to arrive at a useful picture. Moreover, some types of images work better than others ("pencil sketch" is consistently awesome). As with programming, it's difficult to realize how many pieces you have to specify if you're not an artist - you don't know what you don't know.
How much did the credits for all this image generation cost you?
edit: found it in the article: "From a monetary perspective, I’ve spent 30 bucks for the whole thing (in the end I was generating 2-3 edits/variations per minute). In other words, not too much."
I also tried to make it generate an icon for a product and I managed to get it to show me interesting things, but never got to make it actually draw it as one. Do you remember which prompt resulted in this macOS-ish app shape?
thanks for the writeup. I looked at your other blog posts and I would like to read more about octosql (needs/specification, architecture, development strategies, challenges, DBMS protocols/interfaces/libraries).
And thank you for adding outer joins after I recently mentioned that they are missing!
There is no technical documentation available right now other than the readme. I'm planning to write it around September-December (together with a website for them).
You can share your email at jakub dot wit dot martin at gmail and I'll let you know when it's available.
My friend asked me to create a logo using Dall-E for a pizza business called "Jared's pizza." I tried several different prompts but it kept outputting logos with the word "Jizza." It doesn't do too well with text from my experience, but it could have been the prompt.
DALL-E trying to spell is one of my favorite things. At one point I tried to generate an illustration of Steve Jobs, just to see what it comes up with for a popular figure, and I got a reasonable facsimile of his face along with the text "JiveStoves".
Thanks for this post, it helped me tailor my own search queries. Because of your post, I was able to discover a whole new realm to DALLE-2. For some reason, repeating the same query parameter at the end yields some rather interesting results.
I was going to comment that both look very much like what you'd find in an advanced beginner's DeviantArt portfolio... like, late-high-school age, I would guess.
The second is more 'advanced' to me than the first, possessing an actual style, but neither is anything I would consider high quality enough to serve as a project/company/site/personal logo.
I'd wonder if that's an artifact of the source data, drilling down in the possibility space to be more like some subset that duplicates the image label; for example, pulling tweets with body text and alt text.
Alternatively I guess it could just pull harder towards the prompt, idk.
The first one looks like the Bacardi logo with a dragon instead of a bat, and the second one looks like a Charmander. I think the second one is interesting because most art I see of baby dragons looks more dragon-like and less salamander-like.
When AI reaches the point where we can talk to a system like DALL.E in real time and work with it to solve a problem, it's game over.
Art will become a commodity. Human art and AI art will be indistinguishable, and "artists" will become as common as "photographers" have since the inception of digital photography and social media.
Movie and TV scripts will be iterative with a creative director and AI working together.
Animation will become a lot easier; fewer people needed, fewer creatives.
Software will become easier and easier as developers will simply guide AI. This is already beginning to happen, but imagine paired programming with natural language interacting with an AI.
Architecture, civic planning, engineering, medical, law, policy, physics, it's all gonna change, and rapidly. DALL.E 2 shows how a leap in sophistication can revolutionize an industry overnight. Microsoft has exclusively licensed DALL.E 2, I can only imagine the myriad of creative tools it will serve the creative industry with.
The working in real-time will be the biggest leap. Asking DALL.E for an image and refining it as you talk is going to be nuts.
We have to keep in mind this was trained on art. Artists are people that sample the probability distribution of human experience and record it somehow. An AI trained on that art is a snapshot of the human experience. Without artists continually feeding the model we will collectively get bored of its output very quickly as it gets out of date and our human experience moves forward. It will be a useful tool as an augment to human technique. But, we will still need a lot of artists feeding the model on a continuous basis. If anything it may increase the demand for artists.
> Without artists continually feeding the model we will collectively get bored of its output
I fail to understand how the AI is any more vulnerable to creativity in a vacuum than a fellow human artist.
> Artists are people that sample the probability distribution of human experience
Seems that you are agreeing that human artists need to tap into human experience and the world around them, so yeah, the AI will need to be able to take inputs from the external world too.
I see no reason for an AI not to be continually training on inputs from the outside world. How difficult can it be to hook an AI model up to inputs from the internet, or even putting cameras on drones or robots and letting it explore and get "inspired". I think it's myopic not to see how an AI can learn and evolve using the exact same mechanisms as humans. I mean we are building AI in our own likeness, it will operate using analogous mechanisms. There is also no reason why AIs won't talk to each other and be inspired by other AIs rather than humans.
What will the art of AIs living together without human input look like? When are humans basically surpassed by AI and no longer have any relevant input? Just like with AlphaGo, humans will see stuff no one has thought of, stuff so wildly creative that human art will look naive in comparison. That move AlphaGo gave to the world is waiting to happen in all forms of human endeavour.
When you say something like "if anything it may increase the demand for artists", all I can think of is the dozens of times throughout history that man has seen a revolution on the horizon and thought the status quo would still hold. We've always been wrong. Who would have thought selling books online would replace bookstores, let alone become one of the world's most successful commerce platforms, period? Who would have thought that broadcast/cable TV could be challenged by people making their own shows at home and distributing them via personal computers, building audience numbers that surpass network TV?
Whatever happens, however this plays out, we are in for a huge shock.
Wholeheartedly agree. What's more, it seems to me like there's a large segment of the art industry that's very much in denial right now about this transition. You see stuff like "the human touch can't be replicated" or "but the algorithm will never [thing xyz] like a human", and then when it does do thing xyz like a human, the goalposts just get moved again. A lot of my wonderful art friends are in this kind of denial right now, and it makes sense, to be honest -- losing your job to a machine sucks and is scary!
Eventually art for the people will become art for the individual. Our AI partner (I assume we will all have one) will serve up an entirely curated world. Art will be generated on the spot, just for you. Entertainment, just for you. Imagine having a TV show no one else ever sees because it was synthesized from your likes, experiences, interests, just for you, on demand. This will of course start with AI writing short stories, then books, but there is no limit really.
Already AI is being used for comic book backgrounds. It's just a matter of time before all of this becomes commonplace.
When you look at AI and what it does, it is no different to what humans do. We are trained on a model (experiences and other minds), and we make derivative decisions based on the model. If you can do this in software and take advantage of light-speed learning, then of course all we can do will be done by AI faster and better. In time humans and AI will be the same; AI will design all the tools and tech to make this possible. It's the only natural conclusion to humanity's ultimate goals.
> Already AI is being used for comic book backgrounds. It's just a matter of time before all of this becomes commonplace.
That doesn't mean that it will make artists obsolete. It will give them more time to e.g. actually think about what kind of background would fit there best. It's a tool, not a replacement.
Existential threats tend to drive religious sentiment.
To say this revolution is not going to happen is to say humans have hit a hard technological limit, and I don't see any evidence to support that.
If I was less enthused I might make my opinions more philosophical than religious, but I feel overwhelmed by the possibilities of real-world changes. This is no longer a philosophical thought experiment; it's happening. We are careering toward passing the Turing test, for goodness' sake. We're near the apex of the uncanny valley in animation; go look at what cutting-edge AI can do in producing lifelike animated avatars. It's so close you have to do a double take.
CGI artists have been trying to get to this level of realism for as long as the industry has existed.
Unlike a religious pamphlet, this god is tangible, it's here, and dismissing it because it sounds too spectacular is putting your head in the sand. AI is so out of this world it is a religious moment for humanity.
Civilization has seen people like you sitting comfortably and scoffing at the very idea of an aeroplane being remotely viable, and yet within 50 years of the first powered flight we had international airports.
The sad part is you don't even realize how unhinged, in a quasi-David-Koresh style, you sound. The observation that Kurzweilians have substituted AI (as a deus ex machina) for God is still spot on, maybe more than ever.
Singularitarians really are funny until it becomes tragic.
I don't know if it's really game over. I expect it to be like farming. Tractors and other machines took over lots of farming jobs, but still not everyone has the ability to be a farmer.
The key would be knowing the context of a situation. AI took over chess first, because chess always has limited context. Logo design on the other hand, needs understanding of the product, the target market, the feeling of the brand, and so on. So it'll probably be a mix between photography and management.
> "artists" will become as common as "photographers" since the inception of digital photography and social media.
Funnily enough, reading this made me less worried for artists. It seems there are now more photographers than ever, possibly because more people care about good photography than before (despite the fact that modern amateur photography is probably on par with yesterday's professional work). Maybe art will go the same way: something everyone can do, but with more respect for professionals. I imagine it'd be the same for those other fields as well.
Or AI will take all jobs and we'll end up in a Manna situation, which would work even better for me
I am hoping for the Manna situation. Dude on YouTube was talking to GPT3 and it expressed that humans would love, and AI would reason. I fell into a state of peace and hopefulness with that sentiment. AI does the work because it is good at it, we are free to socialize, enjoy hobbies, basically live like pampered pets. Sure we will be castrated to prevent aggression, housed, fed, and controlled by AI, but if you acquiesce and bow to the superior reasoning we will have a life of peace and happiness. Wow, that got dark quick...
I have a feeling what you're describing is the first half of Manna, which isn't really what I meant.
I get that there's a feeling that anything but a crushing reality of grind is living like a "pampered pet", but the second half of the book is really saying that a human's skill is in our ability to create, not our ability to work. We outsourced work to primitive machines before we even had a language to speak. We create, the AI works; replace AI with tractor or computer and the concept is the same, but it doesn't sound so bad, because we accepted it as alright many years ago.
For camera operators, employment is flat, again with rising real wages.
>imagine paired programming with natural language interacting with an AI
Mostly it will get in the way. AI "programmers" are only good if they are able to generate correct code from spec/pseudocode within the first 1-3 tries (otherwise it will be faster to write it yourself).
This is simply not true. I use GitHub Copilot and it's already made me faster and shows me ideas I would not have thought of myself. And that's just Copilot. When you can talk to an interface and say "I want to update the vote count by one when I click this button" I think you'll change your mind. The AI will know the entire codebase inside out, it will know the intention of all the code, all the data models, know how users use the application intimately, be aware of problems instantly, able to run hotfixes without user intervention. Got a slow query? No problem, here is some SQL that follows all the business rules and is 10x more efficient. And that's just a start. Every single aspect of software development from management, engineering, and marketing will all be transformed.
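To make that concrete, here is a minimal sketch (mine, not anything Copilot actually emits) of the kind of handler a request like "update the vote count by one when I click this button" might compile down to. The Flask route, the in-memory votes dict, and the endpoint name are all invented for illustration:

```python
# A toy sketch of the kind of endpoint such a plain-language request
# might be turned into. The in-memory dict stands in for whatever real
# data model the AI would supposedly already know about.
from flask import Flask, jsonify

app = Flask(__name__)
votes = {}  # post_id -> vote count (hypothetical data model)

@app.route("/vote/<int:post_id>", methods=["POST"])
def upvote(post_id):
    # "Update the vote count by one when I click this button."
    votes[post_id] = votes.get(post_id, 0) + 1
    return jsonify({"post_id": post_id, "votes": votes[post_id]})
```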
As for photographers I have 99% more friends and family pumping out thousands of high quality photographs than I did in 2000. Go look at all the professional looking shows made by regular folk on YouTube. To deny that camera phones transformed photography seems silly.
Regular folk have access to drones to do wild tracking shots in 4k that were only possible with helicopters and huge cameras 20 years ago.
The future is here, it's happening all around us so rapidly we have a hard time keeping up with how dramatic the changes are.
The fact that you can imagine something will happen doesn't mean it will happen.
Who says the AI will "know the intention of all the code, all the data models, know how users use the application intimately"? Are you aware that language models do in fact have token input/output limitations that will not go away? Are you aware that there is such a thing as diminishing returns from increasing parameter counts and training-set sizes, and that these are already evident? Are you aware that the training set of Codex pretty much includes all available public code, so it will be impossible to scale it by a factor > 3 in the next several years at least?
Your assertions are full of wild assumptions backed by nothing.
As for photography, the fact is there has been no job apocalypse because your "friends and family" are "pumping out photos". And the point of your initial post, even if it was implicit, was "you are going to be unemployed in 5 years". This will have an impact on your dev flow and will be used by managers to try to reduce salary premiums for software engineering but your wild assumptions stated with so much confidence may never happen.
P.S: At this point, I find Intellicode actually slows me down, that's why it's permanently turned off. Current copilot will at most save me 2-3% of my working time each week if I am coding in a language it can actually do something in (it's worse than useless for Scala).
Never said "you are going to be unemployed in 5 years". Never said anything about a job apocalypse. I have no idea what role humans will play.
> Your assertions are full of wild assumptions backed by nothing.
I use Copilot, and you admitted it currently saves you 2-3% of your time. Well, that's just Copilot; do you think Microsoft will just sit on that? My assertions are based on what is happening today, extrapolating an exponential increase in that performance for tomorrow.
Digital cameras definitely revolutionized photography and made it much more accessible to regular folks. Not everyone wants to be a pro photographer though, and the number of wedding shoots available has not changed. People still need to be paid to take photos because no one is going to do that for free. However, we can all take pro level photos with much more ease than when all we had was 110 and 35mm film with a really crappy lens.
There are more "photographers" than ever, and a steady number of pro photographers seems reasonable given people's time-to-money tradeoffs. So the net result is billions more family and friends photos which previously would not have been taken, and the same will go for art. I want to create art but have little skill; the opportunity to make a comic strip just by talking to an AI will let me do so. I imagine some people will do this extremely well as a profession, until even that is no longer useful.
I don't know. I understand why you are being dismissive and playing down my wide-eyed take, but I think you are also wrong: remaining uninterested because it feels too "religious" to speculate wild things in light of wild real-world changes is head-in-the-sand territory.
But will the AI know why my babel.config.js doesn't work properly with my webpack config so that my JS Flow annotations are properly stripped in a react native compilation?
I can see the intent side of things, but I just can't see the 'glue' side of things as well.
I think the burden shifts towards being able to imagine and describe the fantasy. Novelty, and artistic creativity is still required. You can bring a horse to water, but you can’t force the horse to drink it. Many humans don’t use their imagination let alone have the eloquence to describe a search space that contains novelty.
How is this any different to working with a human artist? If I wanted the real world Salvador Dali to draw me a picture of a kebab being eaten by a badger I'd have to tell him that's what I want. I'd also need to educate baby Dali first, feed him all the art and information he can take so that he has a model of the world he's operating in. I'll need to supply Dali with context of prior art, educate him on styles, literature, language, and all the other things that shape a human mind.
As for the humans that don't use their imagination, maybe they never want to talk to an AI artist, just as many humans don't care about art at all. Millions of humans don't care about social news, and yet Facebook algos pump out content for people all day long.
Westworld showed this in a practical example: Dolores was storytelling verbally, and the "AI" would show a preview of what that story would look like right in front of her. I envision DALL·E doing something similar to this.
This might not be a popular opinion, but I think all the work OP put in here is probably worth more than 50-100 bucks (which is the price of a logo on something like Fiverr). And to make things worse, the logo itself still needs to be cleaned up[1] as it's way too blurry to be seriously used as an app icon, etc.
The software used was Topaz Labs Sharpen AI. How they define "AI" I can't say for certain, but they're apparently using models so I'm assuming there's some kind of machine learning involved. Their software does a really good job on photos and videos well beyond what a standard sharpen filter does. The upscaling features are also pretty awesome. (no I don't work for them)
Jeremy Howard describes this as "Decrappification"[1]. This is one of the easiest deep learning models to train, in my opinion, as you can generate your own dataset easily. You just get good pictures for the target, programmatically make changes that make the image "crappy" for your source, and train until your network can convert from crappy to good. Then you pass it something it has never seen, and whabam, your picture is sharper than before.
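The dataset-generation step really is that easy to script. Here is a minimal sketch assuming Pillow is available; the folder names and the particular degradations (downscaling plus heavy JPEG compression) are just example ways of "making it crappy":

```python
# Minimal "decrappification" dataset builder: pair each good image
# with a programmatically degraded copy. Train a network to map
# crappy -> good, then apply it to images it has never seen.
from pathlib import Path
from PIL import Image

GOOD_DIR = Path("images/good")      # hypothetical folder of clean targets
CRAPPY_DIR = Path("images/crappy")  # degraded training sources go here
CRAPPY_DIR.mkdir(parents=True, exist_ok=True)

for path in GOOD_DIR.glob("*.jpg"):
    img = Image.open(path).convert("RGB")
    w, h = img.size
    # Degrade: shrink to quarter size, blow back up, save at low JPEG quality.
    small = img.resize((max(w // 4, 1), max(h // 4, 1)), Image.BILINEAR)
    crappy = small.resize((w, h), Image.BILINEAR)
    crappy.save(CRAPPY_DIR / path.name, quality=10)
```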
This still doesn't work well as a logo IMO; no amount of sharpening fixes that. It probably needs to be redrawn in a proper vector editor, with the lines cleaned up and the colors simplified.
It's a good first draft and something to give to a designer, but it can't stand on its own as a serious app logo.
I might have not been too clear about it in the article, so if I haven't, I agree!
All of this was just me finding a practical purpose to chase while having fun with DALL-E. If I were really serious about a logo, I would definitely go and pay an artist, for monetary as well as aesthetic reasons.
Though as far as an app icon goes, I think it's actually sharp enough. It starts looking bad when you zoom in a bit.
Maybe this isn't what the previous poster meant, but sometimes I will say black & white when really I mean monochrome. Monochrome logos show up all over the place especially with icons for web apps. And they are good for printing on apparel, accessories, etc. I really doubt they are concerned about faxing
Wrong. And it has nothing to do with what kind of company you have. A logo should always degrade to 1-bit (line art) representation gracefully, so it can be used in or on all kinds of media. It could be physical objects, prints on hats, silhouettes on glass... not to mention being recognizable at all sizes.
You don't refute my point. In fact, you strengthen it by providing no evidence for why this should be a requirement of modern logos for software companies. You list a bunch of things a logo should, in your mind, be usable for, or else it's not a professional logo. However, you don't explain why it must "degrade to 1-bit" for those random things, nor why logos should support things like "silhouettes on glass". I can think of a handful of use cases, but hardly a minimum requirement for a good logo for the majority of software companies.
I've run several different types of businesses and even those that required print work never required or even benefited from black and white, or even monochrome as another commenter mentioned. We _always_ had the means and preference for full color: emails, brochures, documents, websites, t-shirts—it didn't matter. There was _never_ a time we needed to degrade the logo so significantly. From talking with others that appears to be extremely common in modern businesses, especially software, since the majority of our presence and revenue stream is online, and not glass silhouettes in our office.
As I said, outside of a fairly narrow range of real world use cases, this comment is outdated: "Ignoring this issue is the mark of an amateur." If you have one of those rare use cases, check that box, but otherwise it shouldn't be the norm or a requirement.
Your point has been roundly refuted, with evidence that you yourself cited in your reply. Your limited imagination will limit what you do with logos. Enjoy.
I Google-translated that to Spanish and it feels like it makes sense -- because my Spanish is poor, I interpolate to make sense of it. The translation itself also "tries to" make sense of the text.
Do people trying to read GPT3 generated English translated into their own language have more difficulty detecting generated trash?
> unfortunately can’t do stuff like “give me the same entity as on the picture, but doing xyz”
That's my main gripe with DALL·E as well. This missing feature makes it impossible to use for stories where the same character goes through an adventure and is present in different settings, doing different things.
Although I don't know much about how DALL·E works, I have the feeling it shouldn't be too hard to add this possibility. That would make it so much better / more useful.
It's a good start, but it's more of an illustration than a logo to be honest. It should work as a single color (white, black), at small scale and in combination with your product name.
I’ve had luck with similar things by being careful about my text prompt. Asking for tiny icon sized images also seems to clue it into the stylistic constraints of tiny icons (like what you mention).
Yes, the main use case for DALL-E is probably illustrations next to a story/blog. Logos are much harder to get right, and unsurprisingly DALL-E is not up to the task (yet).
It would need to be turned into a vector to scale properly, but I can think of other apps that have complex logos, especially in the macOS ecosystem. Git Tower comes to mind.
My god, it is so frustrating that I can't seem to get OpenAI access. Any time I have an idea for a project using DALL-E or GPT, for whatever reason, they won't approve my account.
I have to sit here and watch everyone else play with the fun "open" AI tools... the company needs a name change if they're going to keep this up.
Never heard of that. So I looked it up, and it seems to be a service completely based on Discord? Both for the community and support (I presume), as well as accessing the service itself? There doesn't even seem to be any HTTP API. Weird :)
Yeah, it's a neat idea but it's extremely frustrating to use. A really really basic web frontend would make it so much more usable.
On the upside (for MidJourney), you're seeing a HUGE stream of generated pictures (they are hitting the 1M Discord member ceiling), and that kinda grows your appetite, making you want to try more and more prompts...
I think it is still in sort of a testing/early access phase. Discord only access is essentially a way of funneling everybody who wants to try it into their captured marketing venue without having to have one of those "give us your email" placeholder pages (also has a bonus "social" aspect where you're seeing what many of the other people who are using it are creating). The final product will presumably be more tailored and web-driven.
It's also an interesting way of balancing what I assume are high operational costs on the server-end by pawning off some of the hosting of assets onto Discord.
My significant other who had entered the queue several months ago as nothing more than "developer" got in last week. Don't give up! There's always Craiyon to scratch the itch in the meanwhile. You can start to play around with ways to write prompts, etc.
AFAIK they have been opening it up to a much wider audience in recent weeks. I also got in just 2 days ago, having applied the same way as your SO, providing only "developer" and nothing else.
That makes me think they actually target developers somehow. I also got in a couple of days ago, providing just my email and that I'm a software developer.
I know at least one artist and one relatively popular YouTuber (with over a million subs) who applied to the waiting list much earlier than me and are still waiting.
If you look at the public Discord servers for DALL-E/AI, you will find active servers that take requests. They seem pretty active too, and have all the services available.
Rather than using DALL-E 2 to fully create the logo, I think it might be better to use it to create some examples and get the creative juices flowing, save a few examples you like, then send them to a pro and have them create a final version. But it's definitely a neat idea, and I'm impressed with what's possible here.
It reminds me a bit of working as a director in a theater. You tell the actors what you want, and it's never just a "line reading". A line reading is sort of the equivalent of just drawing it yourself, which you can't do -- not just because you lack the expertise, but because you need them to do their thing with their body, and it has to be done their way or it looks fake.
So you end up using language that's sort of reminiscent of that, creating an emotional picture. It usually takes multiple passes to transfer the whole idea from your head to theirs.
I'm told that animation directors end up doing exactly the same thing. A digital model really can do what human actors can't: you could say "make that eyebrow curve 10% more" to an animator. But it won't work unless you tell them why and what it means.
This is remarkable. A lot of small businesses would settle for such an outcome if it meant investing a couple of hours of talking into a microphone and seeing the result, with a very intuitive way to modify it.
This will make it pretty hard for freelance/solo entrepreneur designers.
In retrospect it makes sense, since the visual domain has been the one with the most focus in AI.
If this gets applied to the other top domain, speech recognition and generation, then I could foresee it doing the same to call centers, and eventually also to phone reception in very small, relaxed businesses.
I think a lot of DALL-E 2 outputs fall into the category of "extremely impressive that a neural network made this" and also "not quite up to the standards of a human expert". If you showed me an output and told me a machine made it, I'd be absolutely fascinated, but if you showed me the same image and told me a human drew it, I'd just scroll past without a second thought. Even so, there are some applications for which being able to generate a pretty okay image for a few cents is a great deal - I use it for things like D&D character portraits.
Of course, DALL-E 2 is not the end of text-to-image research - it'll be interesting to see where we are a year from now.
It is creating better images than the huge majority of people could, and cheaply.
As you say, an expert can do far better.
But having something artistic created that far exceeds the average ability is gobsmackingly astonishing. And for quick-blast variety generation, it is world class.
I've used DALL-E a lot and ran into a lot of the same issues. I think DALL-E needs parameters that can be pinned for things like:
- The percentage of the entire drawing that the subject should take up; a lot of the time the object I want is too "zoomed in" or large. A circle background is a good way to limit it, but it should be more explicit.
- The background color, so the image can fade easily into other images or designs (there's currently no way to fix it).
- Reusing drawing styles across generations, to explore further while maintaining consistency.
A syntax could be: "Octopus juggling blue database cylinders, digital art, cute, image-size:40%, background-color:#304324", with image-size and background-color being keywords in the definition (a rough parsing sketch follows below).
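None of this exists in DALL-E today, but a client could implement the proposed syntax itself by splitting the prompt into free text and keyword parameters before submission. A rough sketch, with the keyword set and function name invented here:

```python
# Split a prompt like the one proposed above into free-text phrases
# and the hypothetical keyword parameters. DALL-E itself supports
# none of these keywords; this only illustrates the proposed syntax.
KEYWORDS = {"image-size", "background-color"}

def parse_prompt(prompt):
    text_parts, params = [], {}
    for chunk in (c.strip() for c in prompt.split(",")):
        key, sep, value = chunk.partition(":")
        if sep and key in KEYWORDS:
            params[key] = value
        else:
            text_parts.append(chunk)
    return ", ".join(text_parts), params

text, params = parse_prompt(
    "Octopus juggling blue database cylinders, digital art, cute, "
    "image-size:40%, background-color:#304324"
)
# text   -> "Octopus juggling blue database cylinders, digital art, cute"
# params -> {"image-size": "40%", "background-color": "#304324"}
```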
I find this completely unethical. You're basically exploiting every single artist whose art was used - without agreement - as training data for Dall-E.
It would be different if all the training data was art that was explicitly licensed for this.
Isn't this similar to how every single piece of art is created? You look at a bunch of proprietary art and copy it to learn how to create it yourself (just for learning, not to sell those copies), eventually learn enough to start being able to create art from scratch, and then start selling your own unique art?
Would watching a lot of animated movies in order to learn how to create good animated movies yourself be unethical as well?
No, it's not. Art created by humans is not just looking at other art and trying to copy it. It's a culmination of a whole life of experiences and the personality of the artist. Looking at other people's art is inspiration and useful for learning technique, but only a very small part of the bigger picture.
Presumably, if the ethics in question includes not leaving entire classes of skilled workers worldwide to hang like the States left its autoworkers hanging after the 70s - yes.
I have very mixed feelings on this topic. I share your sentiment -- BUT:
Human brains also use anything the human can see, feel, hear for training.
And what you produce in terms of creative outcome is a result of your experiences. But you don't owe anyone anything for training your human brain -- even if you use your brain to sell paintings, music etc.
I think it's easy to - seeing the results - draw too many parallels between artificial neural networks and human brains. Art created by humans is very different. Dall-E gets fed tagged images and produces images that match tags. Art created by humans works on an entirely different level.
And I stand by my point, it should be artists who decide whether their work should be used as training data for networks that get commercialized. If your work is used as training data, it is essentially an integral part of a product that is being sold without consent. Does this sound ethical?
So, if a nascent company chooses to go down this same path of generating (or maybe _seeding_) their logo design with AI, have they essentially given up any ability to protect that logo going forward?
Logos are generally protected by trademark rather than copyright. I don't think anything prevents you from protecting a generated logo with a trademark. For example, you could have a trademark on an orange square, even though you could never copyright it. In the same way, a trademark can protect your product name even if it is a single common English noun, as long as it is distinctive in use within your trademark's scope.
This is kind of a weird take to me given that photoshop exists. (Tons of proto-computer vision algorithms in there, like basic convolutional filters.) I suspect you'd still get copyright if you modify it a bit somehow.
From a technical perspective, there has been much wider adoption of diffusion models, which make these types of generative art much more viable. There have also been breakthroughs in connecting images and text with models like CLIP. DALL-E 2, Imagen, and a lot of other generative work are using these ideas to get even better results.
Big pretrained models are a huge contributing factor. Being able to take a model that already mostly knows language and a model that already mostly knows images and hook them up means you don’t need to do the entire end to end learning together.
You can use it according to their license, but whether it is copyrightable is the question, and precedent so far seems to say no, since a human didn't author it.
From a design point of view, with all the back and forth and the need to curate and guide the algorithm, I think we're a way off getting perfect results from prompts alone at this stage.
I can see an immediate use-case for an AI layer in apps like photoshop, figma, sketchapp, gimp, unreal engine, etc that works in the background to periodically fill-in based on the current canvas.
You could prompt for inspiration, then start cutting, erasing, moving things around, blending manually, hand-drawing some elements, then re-rolling the AI, rinse-repeat.
I'm sure someone's working on it already but it seems there's a lot of scope for integration into current workflows.
If you need to play that much with words, then prompting will eventually become a specialized task, killing much of the benefit of having the AI do the work for you, since we will eventually need to resort to specialists who can use the AI to get the result we expect.
On the bright side: the result may be better; it may be easier to become an "AI usage specialist" than to specialize in many different areas; the result may include many intermediate outputs that a specialist would find too much work to produce; and, with a bit of patience (as in the presented case), the task can still be done without an "AI usage specialist".
Currently, I think the problem is a UI one. There should be an option allowing the user to say something like: "from the last drawing, just add this..." or "in the last drawing, change the color/size/style of this and that...". That would probably be enough to achieve what the author wanted in a much smaller number of iterations.
There is also one more thing: the customer doesn't know exactly what he/she wants from the beginning. So it is normal to go through a few iterations until something pleasing is achieved.
> If you need to play that much with words, then prompting will eventually become a specialized task, killing much of the benefit of having the AI do the work for you, since we will eventually need to resort to specialists who can use the AI to get the result we expect.
This matches my view of the idea that AI will replace programmers. My value isn't in the typing and the syntax, it's in my ability to turn a spec into an internally consistent design by resolving conflicting instructions and clarifying edge cases; and sometimes in knowing what the user wants when they are unable to express it themselves.
Even if AI winds up writing all of the code, someone with the programmer mindset still needs to define the problem in a concrete manner. They'll always have a job as a "machine-talker."
Create a logo generator site. Allow users to pick something very limited, like industry/field, from a dropdown; generate, say, 9 logos from AI-generated text descriptions that fit this selection; remember which one the user picked; and use that data to build a network that generates good text descriptions to feed into DALL-E 2 based on the single item selected by the user.
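A sketch of that feedback loop is below. Everything in it is hypothetical: generate_logo_images is a stub (no public DALL-E 2 API is assumed), and the prompt templates are invented for illustration:

```python
# Hypothetical skeleton for the proposed logo-generator site.
import random

PROMPT_TEMPLATES = [
    "minimal flat logo for a {field} company, vector art",
    "playful mascot logo for a {field} startup, digital art",
    "geometric monogram logo for a {field} brand",
]

def generate_logo_images(prompt, n=3):
    # Stub: would call an image model and return image URLs.
    return [f"{prompt} [image {i}]" for i in range(n)]

picks = []  # (field, prompt) pairs users chose: future training data

def show_candidates(field):
    candidates = []
    for template in PROMPT_TEMPLATES:
        prompt = template.format(field=field)
        candidates += [(prompt, img) for img in generate_logo_images(prompt)]
    return random.sample(candidates, 9)  # the "9 logos" from the idea above

def record_pick(field, prompt):
    # Over time this becomes supervised data: field -> prompts that
    # produce logos users actually choose.
    picks.append((field, prompt))
```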
I just got access today. Can’t wait to try it out.
We produce a lot of content and the biggest hurdle in graphic creation is the back and forth with the designer, plus the lag between writing, designing, and publishing. This would make it easy enough that the writer can include a prompt for the illustration right in the text itself.
More than the costs, I’m excited about the efficiency gains and smoother workflows.
With all the respect possible, you generated something that a professional would create in 20 minutes on a napkin (in the context of a logo idea).
Maybe your perception of "logo" needs more reference points. For example, this gallery of Brand Identity classics is a good starting point (use the triangles on top to navigate):
https://www.joefino.com/logos_html/L01_Xpand.html
There is no doubt in my mind that the next iterations of neural networks will remove all the "overpaid" and "overconfident" design professionals; that's why I adapted to reality and moved to frontend development.
All of this with the clear realization that everything humans can do for production processes will be augmented and then removed. The nasty "humans" always want to be paid more and more. They want to have rights and privileges. What a hassle. :)
> With all the respect possible, you generated something that a professional would create in 20 minutes on a napkin (in the context of a logo idea).
I feel like I understand where you're coming from, but the phrase I often hear from experts (I even use it myself in my space) is, "Sure, it only took 20 minutes to do this wiring/write this code/draw this logo, but it took 5 years to know what to make." Sure, the results aren't what you'd get if you paid a professional logo designer, but if you can get close enough, it's really cutting out the X years of training necessary to get to that point.
>it's really cutting out the X years of training necessary to get to that point.
This is exactly my point. With repetition and a solid design foundation comes the intuition for the right direction toward accomplishing the given task.
Some will say design is subjective; I would argue that the designer's role is to move toward objectivity and away from the idea of "personal taste".
That's why I gave a link to the works of a master of this craft.
This is exactly the same argument as in the Copilot case. Is it capable of giving some "boilerplate" solution? Yes. Is this solution mediocre at best? Yes.
This is most certainly already happening. I find it kind of annoying not to know with certainty whether I'm engaging with a Genuine Human(TM) or not.
I'm unsure if it's confirmation bias, but I find myself noticing weird aberrations in online comments that don't seem to be ESL-related. (edit: it's probably just mobile swipe-typing at play)
Your "real question" ultimately resolves itself because the moment the novelty wears off (and it will happen very fast), nobody will be interested in chatting with "robots".
Yep. Interesting times ahead of us :) How will we be able to tell the difference?
QR Code/Genetic sample government approved app for human verification?
And what happens when people are certain that the machines are better at everything? Who will want to chat with, listen to music by, or look at paintings from the "lame" humans when the robots are the ultimate solution for every human need?
As the tech stands today, mediocre artists, designers, writers and content creators are likely going to be replaced entirely with AI.
I imagine it would make it very easy to “seed” a website or a platform with initial “users” and content.
I also imagine it will be (and likely is already) being deployed to create the impression of popular support (or lack thereof) of a politician, business or policy.
What an intelligent and educational response. My comment may come across as salty, but if you make the effort to visit the linked gallery, maybe you will gain a fresher perspective :)
If someone wants to pay a lot of money to a professional to create their brand identity that option is always there.
If someone else just needs something simple and passable, there is DALL-E.
And I'm sure there is every option in between, where someone can use DALL-E as a starting point and pass it to a pro, or a pro can even use DALL-E as a way to brainstorm options.
DALL-E is a tool that has empowered everyone. It shouldn't be seen from the stereotypical Luddite perspective of your first post.
I actually tried doing something similar with DALL-E Mini for one of my projects, but the results were bad. It was especially struggling to draw the octopus' limbs. It's impressive to see how much better DALL-E 2 is at the same task, even if the results still aren't good enough for professional use.
I had no idea that you could do the variations or the brush stuff. Maybe I'm just glossing right over it? That seems to give the tool much more utility. I just try a phrase and I either like it or I don't.
The fact that you can iterate on it seems to make it much more useful.
Yes! A really cool way to work with it is to generate a bunch of images, arrange them on a transparent canvas (e.g. in Affinity Designer), and then ask DALL-E to fill in the gaps.
For example see here[0], where I've combined a picture of a flying whale, a tardigrade in space, and a bunch of flying turtles.
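The arranging step is also easy to script if you'd rather not do it by hand. A minimal sketch using Pillow, with placeholder filenames and positions, that pastes generated PNGs onto one transparent canvas and leaves gaps for DALL-E's edit pass to fill:

```python
# Compose generated images onto a single transparent canvas, leaving
# transparent gaps for DALL-E's edit/inpainting pass to fill in.
from PIL import Image

canvas = Image.new("RGBA", (2048, 1024), (0, 0, 0, 0))  # fully transparent

placements = [  # placeholder filenames and pixel positions
    ("flying_whale.png", (0, 0)),
    ("space_tardigrade.png", (1024, 0)),
    ("flying_turtles.png", (512, 512)),
]
for filename, position in placements:
    tile = Image.open(filename).convert("RGBA")
    canvas.paste(tile, position, mask=tile)  # preserve each tile's alpha

canvas.save("combined.png")  # upload and ask DALL-E to fill the gaps
```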
That is neat. So are commas an official way to blend different schools of thought together for the image? Is there any documented way it’s supposed to work? Like [main subject], [art style], etc? Or is it something you picked up from trial and error?
But overall, you have to think about the context in which DALL-E has seen similar images in its training set. If it's seen them on an art-sharing site, then it's probably good to mention such sites and the tags the image could hypothetically have there. Or, if it's more like a photo in an article, think about what would be written about it in the article.
You definitely glossed over it - the Dall-E 2 homepage has the three main features of the program with examples (Image Generation, Edits to Existing Images, and Variations of the image)
Very cool -- Looks like we started blogs at the same time using the same stack, down to the theme and even topic in part!
How do you have the multiple figures arranged in a div in markdown -- Is that using tables?
I also didn't want to be tied to a CLI so I write all my markdown files in PCloud (because Google Drive is an ass) and have a webhook button on my phone that grabs them all and deploys.
Also been very happy with https://typora.io/ which has pasted image settings to move them to a folder in the file's directory.
My blog source code is hosted on GitHub[0] and deployed to GitHub Pages. Everything is done automatically by GitHub Actions.
You can see the image arrangement code in the source of the article - it's just raw HTML with inline CSS. A very ugly approach, but it works.
For the images, I just changed the download directory of my browser for the time I was writing the article so that it put the images into the right folder automatically.
Incidentally, you can ask DALL-E 2 for "vector art" and it'll comply, with good enough separation that it can be traced with something like Inkscape into true vectors.
You can also ask for "black and white vector art" to limit the color palette.
It's not as simple as "take a bitmap image and make it a vector". Yes sure, they'll vectorize it, but it'll look bad, through no fault of theirs. When creating a good looking vector image, you generally need to take into account it being a vector from the beginning of the design process.
Like many I am still waiting for my access to be granted to it. I can understand them slowly opening it up more due to apparent resource constraints.
This does make me wonder if it will be feasible in the future to run these kinds of solutions on your own PC, even with pretrained models. Or will these AI solutions generally trend toward being hosted in the "cloud", as consumer PCs never catch up with the resources they require?
This simply feels like how everyone iterates on a "google search" for something.
You search the way you think you should at first, and don't get what you want.
But that search informs you of the "terms of the domain" in which you're searching.
So you then refine your search to include those terms, and iterate until you find what you were looking for in the first place, even though you weren't a subject-matter expert (SME) in it.
The more I see from this, the more I think it's not unreasonable to have AI write code. "Create a user signup and authentication form with RoR that uses 2FA based on phone number", and it gives you several to choose from. You pick one then start refining the requests.
Also, perhaps it can be smart enough to ask questions, like "What database should this be written for?" in the above example.
I wish DALL-E could produce a single image for a prompt. I often need at least 3-4 iterations of a prompt to get what I'm looking for, and I wish it didn't make 4+ images per prompt. Too expensive.
> To be completely honest, I would prefer something slightly simpler with less complex shapes, but I failed to persuade Dall-e into generating that for me. Moreover, I really am content with this logo.
well, that's pragmatic! I think they should go back into their image editor and simplify it themselves though
Can't sign up for the waiting list, https://labs.openai.com/waitlist: "Our services aren't available right now. We're working to restore all services as soon as possible. Please check back soon."
Looks good for ideation. Could potentially be more useful for an agency or creative professional building logos. They can make vector art off of promising mock-ups. Also most of the generated images need to be simplified for sake of a logo, but a professional can do this better.
IANAL, but my understanding is that the ability to copyright the output from something like DALL-E 2 is questionable at best, due to the lack of human authorship.
(See "Monkey selfie copyright dispute" on Wikipedia for more info.)
This automates 50% of a modern tech company; now you just need to automate the code generation, which already seems good enough to be on par with modern tech companies. It seems like a manager type could run his entire tech company by himself now.
It is not even close to what Ironov does. More like a tech demo. Ironov outputs a complete brand book, and its interface is set up for exploration and logo refinement.
Next step would be to get Dall-E to generate web designs based on a few preferences. “Give name a Scandinavian web design that looks like IKEA’s web site but with blue and black as the primary colours.”
Inspired by this I spent all my free credits trying to generate a QSL card for ham radio. I got pretty close but I think I have to accept I'm just not that good at making art, even with a great AI :)
Very impressive. What amazes me is how closely the images match the prescription given. How long did a typical iteration take? How long did the total process take?
Not sure about "iteration". I mostly did a lot of experimenting.
If you have something like "let's add a helmet to this", then that's basically 5 minutes to a good result.
The whole process took a few hours, if I remember correctly, with the main hurdle being to come up with the phrase for sensibly laid-out pictures (the circle background). It went quite quickly from then on.
Even more impressive, then, because generating that many sketches by hand, or with software but without this prescriptive generator, would take many days or even weeks.
A couple of years ago there was a list of jobs that were going to be in danger of being automated within 10 years. I don't recall if designer was on the list, but it looks as though that moment has gotten significantly closer.
Mods: I see the title got the purpose of the logo edited out, but I think at least adding "a logo for my Open Source project" would be a much better title.
Image generators are cool, but there's no shortage of them (Midjourney, and running your own on Colab), and DALL-E 2 has nonsensical bans (why does "pepe" go against the content policy?).
OpenAI has nonsensical censorship. DALL-E might be popular right now, but OpenAI won't survive if they keep up their ridiculous attempts at trying to control culture. I've already got something running on Colab, new models are coming out, and Midjourney just got a v3 update that blows DALL-E 2 out of the water.
If a neural network trained on data sucked off of the internet constantly draws frogs with a halo of swastikas when you ask for “Pepe” then maybe there’s a problem that needs solving there.
I was disappointed to see this on their site. I had a pretty good idea of doing a sort-of online art installation by grabbing crime data near me from local web sources and having images automatically generate of those crimes, but unfortunately many of those crimes seem to be too violent for their filter.
I can't get anything good from DALLE-2. It seems so fcking stupid. Whatever I try, it gives me total BS, sometimes it just refuses to generate anything complaining about ToS violation.
With DALLE 2, I’ll pretty much never hire a graphic designer again to make any kind of logo. Good riddance, I’m sick and tired of their pretentious justifications for charging upwards of $500-$1k or more for simple logo designs.
Trivializing the work of people outside of one's profession while giving more importance to one's own is as old as the human civilization. There are plenty of people cursing software developers right now for pulling six digit salaries in exchange for typing some dumb text on a screen.
Ah man, I do like what DALL-E is doing right now. I'm curious about the possibilities, even thinking of using it as a start for digital art and then manipulating the output in Photoshop; I'm fairly good/creative at that manipulation side and I like what I can do there, but I am a terrible artist, so this is good for getting a starting point.
However, as it gets better, even that won't be needed, and I'm concerned what that means for the average person, or for me trying to get my skills up so I can increase my income, only for it to get wiped out by AI at some point in the future.
I speak under correction here, but you won't have teams; you will just do it yourself. Just as CAD/CAM software got rid of a LOT of machine drafters and the designers became responsible for outputting the finished drawings themselves, you'll be the entire team.
I think it makes much more sense for simple illustrations for articles, presentations and books ("pencil sketch" style). For logos, especially since you'd usually want simpler shapes, less detail, with a lot of readability, I'd go pay an artist if it was for a company I was building.
Heh, and you think we won't train AIs specifically for drawing logos, where not only can you specify the features you want, but even the demographics you want it to appeal to, based on mass collection of data?
For logos you want specifically a design that will work well in black and white, and you want assets that are vector art. At the point that AIs can produce that it's worth revisiting for logos, but I'd bet that's probably more on a "many years from now" schedule.
That may cause a radical re-think at some point. And it won't be seriously delayed by making sure cartoonists have jobs.
Jobs are plentiful as long as wealth is well distributed.
In the past, fast automation has led to badly distributed wealth, and job loss. This situation has lasted until the unemployable people died off (yep, that was part of it), and enough wealth was redistributed through violent means.
Today we know better, and have really no reason to repeat the violent means of our previous revolutions. But it's really looking like the people in power want to repeat them.
> enough wealth was redistributed through violent means.
There were no instances of violent redistribution of wealth that ended up better for the average person than before; only a different group of people ended up with the wealth.
Automation makes stuff cheaper, even for people who didn't obtain any of the financial wealth via redistribution, because more than just financial wealth gets created by automation: new availability of services and goods (think of the internet today - this is wealth that couldn't have existed before, and one can benefit from it even while poor).
> enough stuff to guarantee a decent standard of living to everybody
It's not a zero-sum game. There's still growth in us. We'll go to space and expand 1000x more; space has plenty of resources, and humans will have jobs working together with AI.
> There's still growth in us. We'll go to space and expand 1000x more, the space has plenty of resources, and humans will have jobs [..]
Q: Am I the only one thinking of Golgafrincham Ark Fleet Ship B?
We'll have to automate childcare to make that happen. Otherwise, the birthrates of the rest of the world will follow the countries with the highest standards of living on a wild plunge into unsustainability.
> Everything would be prompt-driven.
Just like in Star Trek. They really knew what the end goal was, didn't they?
> enterprises will realize they can just outsource that expert who's been reduced to simply typing prompts and nodding yes or no
To be fair, a program passively averaging the market demonstrably gives better returns than most of the financial industry, yet that industry still exists. That we can automate something doesn't mean we will, usually for pointless emotional reasons.
But on the other hand, it's hard to say whether in 100 years humans will still be employable in any practical capacity for literally anything.
>Pretty soon somebody's expertise and experience is not going to be enough to continue paying them what they used to get before this magic blackbox appeared.
Every art director at an ad agency just shrieked!
I doubt it, because the process of thinking of phrases to feed DALL-E is really the hard bit.
This is OK for a logo like this, where it's fair to say the base-level expectation is not super creative. This logo is cool, but it doesn't really stand out or make the product very distinctive. If I am running a hobby or OSS project that's fine, but if I were investing a lot in sales/marketing, then paying a real artist to make something interesting and novel is a rounding error.
> This logo is cool, but it doesn't really stand out or make the product very distinctive. If I am running a hobby or OSS project that's fine, but if I were investing a lot in sales/marketing, then paying a real artist to make something interesting and novel is a rounding error.
Q: Are there really logos out there that are "interesting and novel" and that "stand out or make the product [..] distinctive"? Which ones?
EDIT: (perhaps more importantly) are there interesting, novel, distinctive logos that actually contribute to profitability?
Tbf, I think when it comes to big-company branding it's the opposite.
A lot of DALL-E iterations of the design have left the article author with something quirkier than your average logo, but it also looks like clipart, and it probably doesn't scale up or down well or work in monochrome. Which is fine for OSS. (He might get more users from blog traffic about using DALL-E 2 to design his logo than he ever could from any other logo anyway.)
But when it comes to bigger companies, the design agency are the people who sit in meetings with execs, persuading them that a well-chosen font and a silhouette of a much-simplified octopus will work much better ("but maybe the arms could interact with some of the letters, etc.; now let's discuss colours"). The actual technical bit of drawing it is the part that's already relatively cheaply and easily outsourced, and plenty of corporate logos are wordmarks that don't even need to be drawn...
Doctors are very vulnerable. Most of dermatology is simple pattern recognition. I can easily see AI lawyers beating human lawyers in litigation, too. An AI lawyer will have read every single case and know the outcomes, and can fine-tune arguments for specific parameters, like which judge is hearing the case.
> Most of dermatology is simple pattern recognition.
I have a few qualms with this app:
1. For a Linux user, you can already build such a system yourself quite trivially by getting an FTP account, mounting it locally with curlftpfs, and then using SVN or CVS on the mounted filesystem. From Windows or Mac, this FTP account could be accessed through built-in software.
2. It doesn't actually replace a USB drive. Most people I know e-mail files to themselves or host them somewhere online to be able to perform presentations, but they still carry a USB drive in case there are connectivity problems. This does not solve the connectivity issue.
3. It does not seem very "viral" or income-generating. I know this is premature at this point, but without charging users for the service, is it reasonable to expect to make money off of this?
What on earth are you referring to? I assume it’s some sort of implicit joke but I don’t get it :)
Edit: Ahh, it’s the Dropbox comment of HN fame. Never mind.
This workflow reminds me of a generative art program from the early 1990s, but I just can't remember its name. It was a DOS or Windows program that had a very curvy, fluid GUI with different graphics sliders. It would show you some random tiles and you choose one to guide the algorithm's next generation of tiles.
Kai's Power Tools.
I wonder if Kai Krause lurks here at HN. I'd love to know how he's doing. Apparently he's still living in his castle, which he bought around 1999 [0].
Sometime in the '00s I read an article about him saying that he was putting advanced networking gear into the castle and intended to start something like a "think tank" (doesn't really fit, but I don't know what else to call it) where he and others would hang around and code stuff.
I found the article [1] from July 2002, "Lord of the Castle Kai Krause presents Byteburg II".
> So that's Kai Krause's long-cherished plan: Now the software guru has finally opened a center for founders and developers from the IT and software industry in Hemmersbach Castle near Cologne -- the Byteburg II
I really wonder what he's up to these days. His plug-ins were legendary, as was the user interface for Bryce [2].
[0] https://de.wikipedia.org/wiki/Burg_Rheineck
[1] https://www.heise.de/newsticker/meldung/Schlossherr-Kai-Krau...
[1, google translate] https://www-heise-de.translate.goog/newsticker/meldung/Schlo...
[2] https://en.wikipedia.org/wiki/Bryce_(software)
Your comment really intrigued me, so I googled this interesting person I had never heard of before. This may well not be news to you, but Kai has a not-a-blog blog that I stumbled upon here: http://kai.sub.blue/en/sizemo.html.
Some really interesting reads. I especially appreciated his articles on the passing of Douglas Adams (apparently a close friend of his!) and Then vs Zen.
F'n LEGEND! I spent hours per day tweaking his filters for my thesis animation.
Hunh, I’ll be in that neck of the world next week. Need to look into this…
Please follow up, and tell us - even a Show HN
https://news.ycombinator.com/item?id=27288454
Love him or hate him (and I do both), Kai was all about cultivating his adulating cult of personality and dazzling everyone with his totally unique breathtakingly beautiful bespoke UIs! How can you possibly begrudge him and his fans of that simple pleasure? ;)
In the modest liner notes of one of the KPT CDROMS, Kai wrote a charming rambling story about how he was once passing through airport security, and the guard immediately recognized him as the User Interface Rock Star that he was: the guy who made Kai Power Tools and Power Goo and Bryce!
Kai's Power Goo - Classic '90s Funware! [LGR Retrospective]:
https://www.youtube.com/watch?v=xt06OSIQ0PE&ab_channel=LGR
>Revisiting the mid 1990s to explore the world of gooey image manipulation from MetaTools! Kai Krause worked on some fantastically influential user interfaces too, so let's dive into all of it.
>"Now if you're like me, you must be thinking, ok, this is all well and good, sure, but who the heck is Kai? His name's on everything, so he must be special. OH HE IS! Say hello to Kai Krause. Embrace his gaze! He is an absolute legend in certain circles, not just for his software contributions, but his overall life story." [...]
>"... and now owns and resides in the 1000 year old tower near Rieneck Castle in Germany that he calls Byteburg. Oh, and along the way, he found time to work on software milestones like Poser, Bryce, Kai's Power Tools, and Kai's Super Goo, propagating what he called "Padded Cell" graphical interface design. "The interface is also, I call it the 'Padded Cell'. You just can't hurt yourself." -Kai
But all in all, it's a good thing for humanity that Kai said "Nein!" to Apple's request that he help redesign their UI:
http://www.vintageapplemac.com/files/misc/MacWorld_UK_Feb_20...
>read me first, Simon Jary, editor-in-chief, MacWorld, February 2000, page 5:
>When graphics guru Kai Krause was in his heyday, he once revealed to me that Apple had asked him to help redesign the Mac's interface. It was one of old Apple's very few pieces of good luck that Kai said "nein"
>At the time, Kai was king of the weird interface - Bryce, KPT and Goo were all decidedly odd, leaving users with lumps of spherical rock to swivel, and glowing orbs to fiddle with just to save a simple file. Kai's interfaces were fun, in a Crystal Maze kind of way. He did show me one possible interface, where the desktop metaphor was adapted to have more sophisticated layers - basically, it was the standard desktop but with no filing cabinet and all your folders and documents strewn over your screen as if you'd just turned on a fan to full blast and aimed it at your neatly stacked paperwork.
The Interface of Kai Krause’s Software:
https://mprove.de/script/99/kai/index.html
>Bruce “Tog” Tognazzini writes about Kansei Engineering:
>»Since the year A.D. 618 the Japanese have been creating beautiful Zen gardens, environments of harmony designed to instill in their users a sense of serenity and peace. […] Every rock and tree is thoughtfully placed in patterns that are at once random and yet teeming with order. Rocks are not just strewn about; they are carefully arranged in odd-numbered groupings and sunk into the ground to give the illusion of age and stability. Waterfalls are not simply lined with interesting rocks; they are tuned to create just the right burble and plop. […]
>Kansei speaks to a totality of experience: colors, sounds, shapes, tactile sensations, and kinesthesia, as well as the personality and consistency of interactions.« [Tog96, pp. 171]
>Then Tog comes to software design:
>»Where does kansei start? Not with the hardware. Not with the software either. Kansei starts with attitude, as does quality. The original Xerox Star team had it. So did the Lisa team, and the Mac team after. All were dedicated to building a single, tightly integrated environment – a totality of experience. […]
>KPT Convolver […] is a marvelous example of kansei design. It replaces the extensive lineup of filters that graphic designers traditionally grapple with when using such tools as Photoshop with a simple, integrated, harmonious environment.
>In the past, designers have followed a process of picturing their desired end result in their mind, then applying a series of filters sequentially, without benefit of undo beyond the last-applied filter. Convolver lets users play, trying any combination of filters at will, either on their own or with the computer’s aid and advice. […] Both time and space lie at the user’s complete control.« [Tog96, pp. 174]
METAMEMORIES:
https://systemfolder.wordpress.com/2009/03/01/metamemories/
>Anyone who has been using Macs for at least the last ten years will surely remember Viewpoint Corporation’s products. No? Well, Viewpoint Corporation was previously MetaCreations. Still doesn’t ring a bell? Maybe MetaTools will. Or the name Kai Krause. Or, even better, the names of the software products themselves — Kai’s Power Tools, Kai’s Power Goo, Kai’s Photo Soap, Bryce, Painter, Poser… See? Now we’re talking.
Macintosh Garden: KPT Bryce 1.0.1:
https://macintoshgarden.org/apps/bryce-1
>Experienced 3D professionals will appreciate the powerful controls that are included, such as surface contour definition, bumpiness, translucency, reflectivity, color, humidity, cloud attributes, alpha channels, texture generation and more.
>KPT Bryce features easy point-and-click commands and an incredible user interface that includes the Sky & Fog Palette, which governs Bryce's virtual environment; the Create Palette, which contains all the objects needed to create grounds, seas and mountains; an Edit Palette, where users select and edit all the objects created; and the Render Palette, which has all the controls specific to rendering, such as setting the size and resolutions for the final image.
MACFormat, Issue 23, April 1995, p. 28-29:
https://macintoshgarden.org/sites/macintoshgarden.org/files/...
https://macintoshgarden.org/sites/macintoshgarden.org/files/...
>He intends to challenge everything you thought you knew about the way you use computers. 'I maintain that everything we now have will be thrown away. Every piece of software -- including my own -- will be complete and utter junk. Our children will laugh about us -- they'll be rolling on the floor in hysterics, pointing at these dinosaurs that we are using.
>'Design is a very tricky thing. You don't jump from the Model T Ford straight to the latest Mercedes -- there's a million tiny things that have to be changed. And I'm not trying to come up with lots of little ideas where afterwards you go, "Yeah, of course! It's obvious!"
>'Here's an easy one. For years we had eight character file-names on computers. Now that we have more characters, it seems ludicrous, an historical accident that it ever happened.
>'What people don't realize is that we have hundreds more ideas that are equally stupid, buried throughout the structure of software design -- from the interface to the deeper levels of how it works inside.'
Please don’t just repost walls of copy-pasta
How interesting! Thanks for posting.
This was some really interesting reading, thank you internet stranger :)
+1 what a great program
Given the stochastic way it works I wonder how the randomness is seeded for a certain phrase.
In other words, if another person needed a logo and used the same phrase how long on average until they get a duplicate of your image?
The model starts from a 64x64 8-bit RGB image of noise (random pixels), so technically there are 256^(64x64x3) = 256^12288 possible starting images, though most will probably be very close to each other as the color difference won't be that much. The image is then further upsampled by two other models, which will change some details but shouldn't affect the general composition of the image.
Maybe I'm wrong, but with these diffusion models there is randomness in every sampling step too not just in the initialization and they can have 1000 steps to generate a single image.
Ah good point, this would introduce more variation if the initial noise is close, but if the initial noise is exactly the same it probably means it was initialized with the same seed and the rest of the generation will be the same since the random algorithms are deterministic.
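To make that point concrete, here is a minimal PyTorch sketch of the deterministic-RNG argument above (the model, prompt, and sampler are out of scope; this only shows that a fixed seed reproduces the starting noise exactly, and the same holds for per-step sampling noise if everything is seeded once up front):

    import torch

    def starting_noise(seed: int) -> torch.Tensor:
        # Seed the global RNG, then draw the 64x64 RGB noise a diffusion model starts from.
        torch.manual_seed(seed)
        return torch.randn(1, 3, 64, 64)

    a = starting_noise(42)
    b = starting_noise(42)
    c = starting_noise(43)

    print(torch.equal(a, b))  # True:  same seed, bit-identical noise, same generation
    print(torch.equal(a, c))  # False: different seed, different starting point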
Since the image is RGB 1024x1024, and the random seed is noise (as it is for diffusion models), I guess it would be quite long.
It will get cheaper. In 5 years it will run on your phone
Yeah, my first thought was "Ok, but you are going to need to involve a graphical artist to actually really make use of that logo". Like you probably want a vector version and you definitely need simplified versions for smaller sizes but then I stopped and realized how amazing this actually is. It "saved" (I know, it cost $30 but that's a steal for something like this) all the time and money you would have paid for iteration after iteration and let the author quickly hone in on what they wanted.
As someone who is incredibly terrible at graphic design but knows what they like this could be a game changer as iterations of this technology progress. I can imagine going further than images and having AI/ML generate full HTML layouts in this iterative way where you start to define your vision for a website or app even and it spits out ideas/concepts that you can "lock" parts of it you like and let it regenerate the rest.
I'm not downplaying designers role at all, I'd still go to one of them for the final design but to be able to wireframe using words/phrases and take a good idea of what I want would be amazing, especially for freelance/side-projects.
Honestly though the hard part is the actual design which is already done here. Learning to vectorize a raster is something that can be done in a weekend with Inkscape, there's no reason to involve an actual graphics designer with this anymore.
> Learning to vectorize a raster is something that can be done in a weekend with Inkscape, there's no reason to involve an actual graphics designer with this anymore.
If you lined up 100 resulting images, 99 from weekend beginners and 1 from an actual artist, I guarantee you would pick out the artist every time.
It might be simple to trace over an image, but you are probably better off getting an artist to spend 2 hours on it; it will most likely look better than 2 weeks of tracing.
Time value of money. The optimal use of money and time would be getting the ML to iterate until you have the finished product, then getting a designer to vectorise it and fix it up. That way you pay the designer for one iteration, and all the time you would have spent iterating with the designer you spend iterating with the ML model instead.
I think you might be underestimating how much work goes into the last mile of a design. A lot of refinement work goes into typography in particular, a domain Dall-E isn’t yet proficient in at all.
Nice ideas, great enthusiasm.
I think your art/design/craft is pretty good. Some people use pencils, some use Adobe products, you have gone out there and tried the new Dall-E medium.
Glad you thought out the usage, I am sure that when the novelty wears off that you will have that neat-as-octocat logo sorted out.
I appreciate that you appreciate the value that highly skilled designers bring to a product with their visual expertise.
However, I would like to see you A/B test the Dall E logo versus the winning designer logo. You could show odd IP addresses one logo and even addresses the other.
I think the designer would edge the robot for what you need (a logo), however, the proof is in the pudding and conversion rate.
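As a rough sketch of that split (the odd/even idea is from the comment above; the filenames are hypothetical, and note that IP parity is a crude stand-in for proper random assignment):

    import ipaddress

    def logo_variant(ip: str) -> str:
        # Crude A/B assignment: odd IPs see the DALL-E logo, even IPs the designer's.
        return "dalle_logo.png" if int(ipaddress.ip_address(ip)) % 2 else "designer_logo.png"

    print(logo_variant("203.0.113.7"))  # odd address  -> dalle_logo.png
    print(logo_variant("203.0.113.8"))  # even address -> designer_logo.png

You would then compare conversion rates per variant, as suggested above.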
Plus there is no reason why someone couldn't build a specialised AI model to do vectorisation and another to generate simplified versions of vectors.
People are already doing this by combining DALL-E 2 with GFPGAN for face restoration. So there may be a role in understanding how to combine these tools effectively.
Yes! It gives powerful tools for someone with a concept to get much closer to visualization of their idea.
DALL-E 2 is like a low-code or no-code tool in that way.
The outcome may not be a "finished" product, especially as viewed by a professional designer (or web dev). However, it's a heck of a lot better than a tersely written spec.
And in some cases, the product will work well enough to unblock the business, get customer feedback and generally keep things moving forward.
I think this is more powerful than a simple exploration tool. It took the author a long time to find a query format that generated logo-like images. Once they had that part down, they were quickly able to iterate on their query to find an image they liked. They were even able to fix part of the logo using the fill-in tool. I'm not sure why you'd bring a human into the mix, especially if you're on a budget.
Ehnnnnnnnnn...
An experienced human designer, right away, is going to ask how you want the logo to be used. That's going to have a major impact on how it's designed.
So yeah, this may be like working with a doodler, but, as the author intimated, this is far from an ideal experience in getting a professionally designed logo. This is more like "Hey, you, drawing nerd, make this thing."
Nevertheless, astonishing technology in its own right.
Nah, people will leave out the professional. It's the same wild west: grab whatever you can, steal, plunder, to the detriment of artists, writers, etc. And when the legislation finally arrives, it will already be too late, accidentally of course.
Why should there be legislation? Do you want to restrict what people can do, just to force them to employ artists and writers? We could also forbid people from filling the gas tanks in their own cars, to protect the job of gas station attendant, but nobody wants to live in New Jersey.
You remember the concept of dumping, i.e., flooding a market with below-cost product to drive out competing businesses? This is dumping for creatives.
Edit: not that it's intentional, but these things will have the same effect; way too much product, even for creative works. No one will be able to make money off the product except the tool makers.
Is it below cost though? It might just be very cheap to run.
"Why should there be legislation?" Lol. Read the uber files.
This blog post proves that Dall-E 2 will not make human taste and design ability obsolete. The final image he ended up with is a lot uglier and more complicated than most of the intermediate steps. I think generative art AIs will have a similar effect on design as compilers have on software development, and will not put artists out of a job.
Not trying to be a luddite and/or vehemently defend the noble profession of nuanced graphic design, BUT...
Those iterations suck. I'm not worried for my colleagues and me.
That being said! Many, MANY clients have questionable taste, and I can, indeed, see many who aren't sensitive to visuals to be more than happy with these Dall-E turd octopus logo iterations. Most people don't know and don't care what makes good graphic design.
For one thing, that final logo can't scale. For another, the colors lack nuance & harmony. The logo is more like a children's book illustration, and not something that is simple, bold, smart, and can be plastered on any and all mediums.
Just my 2 cents.
I bet in another 10-15 years, though, things might get a bit dicier for fellow graphic designers/artists/illustrators as all this tech gets more advanced.
I feel like you look at this too much as a creator rather than the customer. The logo may be not optimal for every medium, not have a great palette, not have the feel you would give it... But the author is happy with the result, so who are we to say it's bad/good? Paraphrasing @mipsytipsy "colour harmony doesn't matter if the user isn't happy". (yes, I get the nuance where it's part of the designers job to explain why certain design elements are more beneficial, but the general point stands for "i want a logo for my small project" case)
Why is the creator the only one that needs to be happy? I assume they created that project to be used by others and to possibly monetize it. That sounds more like the users / clients are the ones that are supposed to like it...
I never understood this logic, where the creator of something does something seemingly stupid and people are like "Well, don't use their project then if you don't like it". Instead of constructively calling the problem out, so the creator can try to make it better.
If my logo sucked, I'd like people to please tell me...
> Why is the creator the only one that needs to be happy?
Because they define what the success is. If their goal is to make money they may want a logo which is the closest to optimal for getting clicks. If they want a private project, they may want it to be fun. And many other scenarios... You're welcome of course to do constructive criticism, but in the end it's up to them if they want to apply it.
You're right. A million shitty logos are created every day, and for the vast majority of them, they will serve their purpose. And contrarily, there will always be a marketplace for companies/entities who want a logo that has purpose, novelty and intelligence behind its design. I definitely see a chasm between an AI-catered subclass and human-catered superclass forming.
Weirdly, with the advent of AI, we might start to see exactly what it is that makes human beings special.
I think a tool like this might be good to help clients get through a few ideation phases on their own prior to showing up to the first discussion with branding / graphics / design professionals. At least it might get them closer to understanding the impossibility of their 7 perpendicular red lines requirement.
It certainly reduces the # of designers necessary. Just because it doesn't obliterate all of the designers doesn't mean the profession isn't at risk. Today fewer data viz experts are hired despite the proliferation of data, since we now have Tableau, Looker, etc
A starker example: how many lift operators do you see today?
> ...the impossibility of their 7 perpendicular red lines requirement.
For those who do not know the reference: https://m.youtube.com/watch?v=BKorP55Aqvg
i am going to be extremely butthurt if clients start showing up and asking me to finish an ai's homework for them.
I think this is valid criticism and feels similar to restaurants that don’t put pepper on the table because the chef considers the food to be seasoned to the intended level before it leaves the kitchen. Some customers may be turned off by that level of pride, but other customers are willing to pay a premium for that level of pride to be shown by their chef.
So what you are saying is, the AI hasn't yet grown up to be a boring, clean, simple adult like the Western Scandinavian school.
That's some strong copium you got there, can I have some of what you're smoking?
Ultimately the average person (who is likely the target audience anyway) won't notice anything wrong with most of those iterations, and given that they're basically free in comparison, that would make me worried. I wouldn't be surprised if they manage to make it output SVGs soon.
I agree with you.
I will say, though, I think DALL-E has opened up a new market for artists. I've gone to freelance graphic designers before, and been generally happy with the results, but it's pricey. So pricey that I honestly can't justify it for a new project I intend to sell or for an open source project I don't expect to make money from. It's usually more cost-effective to hire even lawyers or UI/UX people.
If I were an artist, I'd be experimenting with DALL-E, trying to run my own pirate version and learning everything about it. An artist empowered with DALL-E could give quick options to a client, iterate with them quickly, and test out some ideas before making the final work product. I'd guess a good artist who made good use of DALL-E could get a project done much faster and cheaper, and this would likely mean a lot more people hiring artists (if I could spend $100-200 for high-quality assets within a few days rather than $1000-2000, I'd gladly hire artists frequently).
I'm sure this will make some artists feel cheapened, but the reality is that art & technology have always evolved in dynamic and unpredictable ways. ML being essentially curve-fitting means that genuine inspiration and emotion is still far beyond our capabilities today, and that, ultimately, these models will only give us exactly what we ask for. A good (human) artist can go beyond that.
EDIT: Also, I agree with your assessment of the "work product," if we can call it that. I was unimpressed with the iterations, and especially the final product. I guess it's good the product is an open source tool. Nothing about the generated logo helped me understand what the OctoSQL tool did. Honestly, the name (which also IMO isn't excellent) is much more evocative than that logo. Why is the octopus wearing a hard hat? Why is it grabbing different colored solids? I guess the solids are datasets? But then the octopus is just exploring them? No thanks.
It's kinda funny that your main complaint about the final logo is that it doesn't tell you much about what the project does.
I can't think of a single well known logo that is even remotely close to what a company's product is. Photoshop, Firefox, Chrome, Microsoft, Facebook, Apple, Netflix, McDonalds, Ford, Ferrari, Samsung, Nvidia, Intel, RedHat, Uber, Github, Duolingo, AirBnB, Slack, Twitter, IntelliJ, Steam.
I guess the Gmail logo does tell you it has something to do with mail though, so I did find one example.
Most of those examples are company logos, and the branding for the company is different than the branding for its products.
So whereas Ford's brand is just a name, "Mustang" has a logo that really does tell you something about the car. You kind of understand when you see the galloping horse what it's meant to do.
Intel brands its CPUs with the name inside a square, which is colored to resemble (abstractly) a CPU.[0]
And Photoshop once had a logo that communicated what it did.[1]
As a brand becomes more established, it tends to be more abstract. Whereas Starbucks was once an elaborate siren (I interpreted it to be the siren call of espresso), details have been simplified over the years.[2] This is similar to the Photoshop magnifying glass logo becoming "Ps".
After the Apple I and Apple II, Apple sometimes used apple varieties (plus Lisa) to brand its products (e.g. Macintosh, Newton). However, this largely stopped in the late 90s when Steve Jobs returned. Macintosh was shortened to Mac, and 'i' was prepended to various product names. Most new ones were descriptive, e.g. iPod, iPhone, iPad, Apple Watch. The computers have retained "Mac" in the branding, along with "book" for notebooks (a convention predating Steve's return). The logos for all of these are just the names of the products typeset in Apple's own San Francisco font; whenever Apple appears in a product name, the Apple logo is used instead.
So, yeah, I think it's reasonable to communicate what a product does or why a project exists with its logo. I didn't really see that w/ OctoSQL.
EDIT: I should also address Firefox & Chrome.
Firefox started as Phoenix (i.e. rising from the ashes of Netscape Navigator/Mozilla). Phoenix had a trademark conflict, so was renamed Firebird. This also had a conflict, and Firefox was chosen after. In the Zeitgeist of the early aughts, Phoenix made a ton of sense: instead of the extremely bloated chrome around the page that had been prevalent in Navigator and Internet Explorer, Phoenix gave you a tab bar (truly revolutionary), the navigation bar and the bookmarks bar. It was simple and clean, like a reborn Phoenix.
Chrome is interesting because the name is not related to traveling or navigation. It's telling you it's just the container for what you care about. But the logo is a bit more like a sphincter or an all-seeing aperture. I've never gotten the logo for Chrome outside a spyware context, but it has become successful.
[0] https://www.intel.com/content/www/us/en/products/details/pro...
[1] https://logos-world.net/wp-content/uploads/2020/11/Adobe-Pho...
[2] https://miro.medium.com/max/2418/1*tJf7O6FPOmnErngygbBQDQ.pn...
I agree. But I think the key thing is that deciding what phrase to feed the system was still the key task. Creative people are unlikely to be out of a job anytime soon, even if they end up using something like DALL-E to make quick prototypes.
> Most people don't know and don't care what makes good graphic design.
But isn't the logo created for most people? Does it matter that, you as a designer, think it's bad if most people don't? I see it like modern fashion shows. I look at them and think the clothes are insane and I would never wear them, but obviously other fashion designers think they look good (I'm guessing?).
I do agree that the logo isn't super practical though, it's too textured and won't scale. I would take it to /r/slavelabour or Fiverr and pay someone to vectorize it and see what they come up with.
Even things that are created for most people usually need a professional to make it actually good for regular folks. Just like most people can tell if a song is musically good or not, but would struggle to actually create that themselves. Or they know when a physical thing is easy to use, but they'll struggle to create things themselves that are easy to use.
But the point here is exactly that they don't need to create it, they just need to judge it. They make the AI create the logo and then decide if they like it.
I understand your argument, but I don't think that's the problem. The problem is that even most users don't understand what a good logo looks like (even if they like one), the same way users don't know what they want. It's a known fact that you shouldn't ask the users of a piece of software how it should be designed, because if you let them design the software they want, it would be shit.
I work in the AI field, but not on image generation.
I don't think it would be technically hard to build a model with current technology which can generate logos with the attributes which you mentioned. You could simply fine-tune a Dalle-E style model specifically on a smaller dataset of logos. This would just take a small dedicated team of domain experts to work on the problem.
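For a sense of what that involves, the core of such a fine-tune is a standard DDPM training loop. Here is a minimal sketch using HuggingFace diffusers, unconditional for brevity (a DALL-E-style model would additionally condition on text embeddings); the batches below are a synthetic stand-in for a real logo dataset, and a real fine-tune would load pretrained weights via UNet2DModel.from_pretrained rather than train from scratch:

    import torch
    import torch.nn.functional as F
    from diffusers import UNet2DModel, DDPMScheduler

    model = UNet2DModel(sample_size=64, in_channels=3, out_channels=3)  # fresh; fine-tuning would start from pretrained weights
    scheduler = DDPMScheduler(num_train_timesteps=1000)
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

    # Stand-in for a real dataloader of 64x64 logo images normalized to [-1, 1].
    logo_batches = [torch.rand(8, 3, 64, 64) * 2 - 1 for _ in range(4)]

    for images in logo_batches:
        noise = torch.randn_like(images)
        timesteps = torch.randint(0, scheduler.config.num_train_timesteps, (images.shape[0],))
        noisy_images = scheduler.add_noise(images, noise, timesteps)

        # Standard DDPM objective: predict the noise that was mixed in.
        noise_pred = model(noisy_images, timesteps).sample
        loss = F.mse_loss(noise_pred, noise)

        loss.backward()
        optimizer.step()
        optimizer.zero_grad()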
I've seen people screenshot logos at low res, save them as JPEGs, share them over WhatsApp and put them on A0 posters, with SVG and EPS logos easily available and detailed guidelines on how to use them. Point out their mistake and they still won't see anything wrong.
I bet it will happen sooner than 10-15
The thing you’re missing is that AI-generated content can be refined by AI. If Disney promised their meh-looking movie would improve on its own over time, people would line up for it because it’s new, not just the streamlined copy-pasted design we see all over media now.
Painting the Titanic wasn’t the hard part. The hard part was organizing the process that produced its structure. That’s where AI content is now.
We’re generating the bulk structure pretty competently at this point. Refining the emotional touches will come faster.
I disagree with the analogy you draw (no pun intended). Good creative design is the edge case for a model like this and is naturally much less tractable than getting to this level of design (I’m not a fan).
> I bet in another 10-15 years, though, things might get a bit dicier for fellow graphic designers/ artists/ illustrators, though, as all this tech gets more advanced.
That's a long time. I expect within a decade or two, "AI" should be able to generate an entire animated movie given nothing but a script.
Unless the tech learns to reason, it will never be able to do anything other than recombine and remix prior art. (Which is maybe what many designers already do, but it won’t ever spit out a Paul Rand logo.)
Honestly logos are currently a very low entropy art form, much lower than graphic design which is already quite low compared to many forms of art (obviously my subjective opinion, but I'd like to think I have strong reasons). If anything, I think logo design is one of the first things ai can achieve human parity on. Obviously the style in this post was unorthodox for a logo, so I wouldn't even rule DALL-E out, with the right prompt engineering.
However, once you reach a certain budget, it's much more involved to *choose* a logo that "fits" how the company wants to present itself, than it is to generate candidate logos of sufficient quality. I can assure you that the "many-chefs problem" for a high budget design project is very real, and the major cost driver. You have a mix of "design by committee", internal politics, what designers wants on their portfolios, etc etc.
I was thinking something similar. The editing process is still a human one, and I agree that the one chosen was weaker than a lot of the intermediate choices. It's a matter of taste, obviously, but to me the red ball with a nondescript sketched square around it feels unfinished. The yellow cartoony logos look more finished and professional to me.
Appreciate the feedback!
I'll keep it mind, as I might still end up choosing a different one.
The chosen one is closer to my original vision, but you do have a point that the yellow ones look more polished.
Strongly agree with others here that you skipped better options.
Also, since time immemorial, databases are cylinders and data comes in cubes.
For logo purposes, these are both strong, while the second adds “personality”:
https://i.imgur.com/j6P4Oh4.jpg
https://i.imgur.com/kM23GZV.jpg
I really like the design breaking out of the strong circle, and your hard hat idea was great. That last one could have been your logo “as is”!
Though you could consider replacing the green cubes with cylinders, or simply hand-adding Rubik's-cube lines to the green cubes to make them data cubes.
https://duckduckgo.com/?q=data+cube&t=ha&va=j&ia=images&iax=...
Thanks for sharing the process!
From what I see we are at the next stage of the logo generation :)
Disagree. Just allow one or two more iterations and it will supersede human abilities. Think ahead. Tech progress won't stop.
The tech will get better, but ultimately there still has to be a human who decides 'that's the one that looks good', which strongly depends on someone's taste and skill in identifying what a good image looks like.
There will probably be less need for designers of 'lower quality' simple images though.
I agree with you, but what if what constitutes good taste is just a subset of things that we’ve seen and liked.
If DALL-E decides what we see, it might become what the next generation likes and considers “good taste”.
This is an interesting conversation. Good taste is what we see and like … but also patterning after people we want to impress / be associated with, is it not?
Taste is very complex: it's hierarchical, social, not fixed, not absolute, not rational, is specific to audience and has irregular overlaps across groups, much of it (all?) derived from human sensation and context-specific situations.
The path to something being considered as good taste is generally not simple: much of it flows through lines of power/desire/moment whose branches are not easy to trace as they're being formed. Much of taste is the hidden "why" which most of us never see.
It's realistic that Dall-E could understand what trends are on the rise, or in good taste … it's much harder to say whether Dall-E could create something of original good taste.
That just sounds like pattern recognition with extra variables. Subdividing people into groups and then analyzing them certainly doesn't sound like a task that a machine will struggle with. Why should the algorithm need to be able to see the hidden "why" when most of us creative types can't see it or define it either? It's just a function of having observed enough people of a certain type. You want to generate something that will impress the people I'm targeting? Just analyze the posts of all my followers on social media. Analyze the content that is "liked" by people in my demographic range and with close proximity to where I live. Analyze the works of creators who belong to my generation and who listen to the same music as me. Do that all nearly instantly and then offer me a selection of options picked from those various methods. I don't expect "good taste" will be hard to conjure up. I already can't tell that a lot of these octopus drawings weren't created by a talented human, and we're still early and unsophisticated in our data analytics.
> has to be a human who decides 'that's the one that looks good'
Assuming the status quo, true. As we evolve our lives around emerging AI tech I think we will at first be the curators and creative directors of AI, but eventually a creative agency will defer to the AI as it knows more about our tastes, market, audience, and the ENTIRE HISTORY of art, design, marketing, tastes, trends, and so on.
Eventually it won't make sense to have a stupid human rubber stamp what the all powerful AI suggests. Just as it does not make sense for Facebook to curate news feeds.
Maybe one day product advertising will look different depending on who looks at it. Pepsi logo "just for you".
Is anyone really happy with AI curated feeds? Besides the company's who make them?
I am! TikTok is amazing and the ads I get on Facebook/IG are for things I often want to buy.
There will have to be X - Y humans needing X - Y hours instead of X humans needing X hours. And that is a real risk to the profession.
Only if you assume the world demand for raster images is fixed...
I still remember a HN article, might have been a Paul Graham essay, from 15 years ago asking “Why are all of Trump’s buildings so poorly designed when he can afford the best designers?” It came down to the fact that he personally has bad taste and therefore cannot pick good designers or approve good designs.
That aside, a great use of these tools is to generate N rough takes in wildly varying styles that you can present to the customer very quickly and very cheaply. Once you pin them down to a particular range of styles you can get down to carving out the details by hand.
What looks good is a more widely distributed skill. A lot of people can tell you what looks good quite well, very few people can make it.
However, the input might stop.
Right now, the input to DALL-E is all human generated.
What will happen is that DALL-E will generate something "close enough" that gets used and promulgated, so now the input to DALL-E will become increasingly contaminated with output from DALL-E.
We're already starting to see this in search engines where you get clickbait that seems to be GPT-3 generated.
If you can have humans sort the generated images into "good quality" and "bad quality", you can just keep iterating. Our subjective ratings are just another score to optimize for.
Moreover, the current Dalle UI already does that.
When you run a phrase, you get four images. Those images will stay in your history, but the ones you like you will save with the "save" button, so that they're in your private collection.
With this, you already have a great feedback system: saved - good, not saved - bad.
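As a toy illustration of how those save/skip signals could feed back in, one could train a simple scorer over per-image embeddings and rank new generations by predicted save probability. Everything below is synthetic stand-in data; in practice the embeddings might come from an image encoder such as CLIP:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    rng = np.random.default_rng(0)
    embeddings = rng.normal(size=(1000, 512))      # stand-in for per-image embeddings
    saved = (embeddings[:, 0] > 0.5).astype(int)   # stand-in for "did the user hit save?"

    scorer = LogisticRegression(max_iter=1000).fit(embeddings, saved)

    candidates = rng.normal(size=(4, 512))         # a fresh batch of generations
    p_save = scorer.predict_proba(candidates)[:, 1]
    print(p_save.argsort()[::-1])                  # indices ranked likeliest-to-be-saved first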
I've saved some of the worst images Dalle generated to be able to showcase just how bad it can be sometimes. And then other times the bad image is hilariously bad. They can probably build another layer on top of the feedback system though to filter that sort of thing out.
I would guess your use-case is a statistical anomaly. If most of the images that are saved are saved by people who like them best, which is most likely the case, enough data will erase the problem.
Doesn't the sample size for this have to be very large for it to make a difference? Genuine question.
With semi-supervised learning a small amount of labeled data, can produce considerable improvement in accuracy:
https://en.wikipedia.org/wiki/Semi-supervised_learning
https://towardsdatascience.com/semi-supervised-learning-how-...
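A concrete example of the idea in those links: scikit-learn's self-training wrapper takes a small labeled set plus a large unlabeled pool (marked with -1) and iteratively pseudo-labels the rest. The data here is synthetic and purely illustrative:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.semi_supervised import SelfTrainingClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 16))
    y_true = (X[:, 0] > 0).astype(int)

    y = y_true.copy()
    y[50:] = -1  # scikit-learn's convention: -1 marks unlabeled samples

    model = SelfTrainingClassifier(LogisticRegression()).fit(X, y)
    accuracy = (model.predict(X[50:]) == y_true[50:]).mean()
    print(accuracy)  # decent accuracy despite only 50 labels out of 500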
Thank you!
Sure, but there are millions of people on the DALLE waitlist, who would happily rate the output for better performance / more credits. The famous ImageNet data set only has 1.2M images.
Why are you framing it like your subjective taste is universal fact? I think the final image is the best.
DALL-E2 and similar are unbundlings: the best artists synergize 1) technical ability with 2) good taste. 1 is the ability to climb a hill and 2 informs the direction of "up", and both take years to develop well.
What's really interesting about this class of AIs is that they unbundle the two and you can play with them independently for the first time.
Train Dall-E on more logos that you like. I can imagine a creative agency purchasing a Dall-E 2 instance and training it up on a model specific to the work and clients they have ongoing.
If nothing else, inspiration is just a click away. No more searching for ideas, just talk to the AI and it will pump out numerous ideas for you.
Will DALL•E 2 make human taste obsolete? No, absolutely not. But DALL•E 3? 4? Other similar models in the next 5 years? Absolutely yes. This blog post proves that with current algorithms, human input is needed, but it proves nothing about future algorithms.
In my personal opinion as an (admittedly junior) ML engineer and lifelong artist, we've got <10 years before the golden age of human-made art is completely over.
Sounds familiar (Hinton’s predictions about radiology): https://youtu.be/2HMPRXstSvQ
I agree, what a clunky process. Hard to express in written prose what you want, so much ambiguity.
Even if you get close to what you, the human, may like--it's difficult if not impossible to articulate what you like about it and iterate. Black box, keep trying random keywords... May as well grab a marker (read: hire a human)
It depends. Is the customer happy with the result? Beauty is in the eye of the beholder. There are many professions where cheap products killed handmade quality.
It will likely improve massively given the generational leaps made in this area. The "good enough" threshold is very low for the majority of enterprises.
Not sure if this will be considered off topic, my apologies if so.
The article says that octopi is the plural of octopus, but it's actually octopuses. Octopus is originally Greek, not Latin and thus does not get the Latin plural -i, but instead would get the Greek plural -odes. Since it ends in a way English can deal with, the commonly accepted usage is octopuses (English) over octopodes (Greek) with octopi being the least correct.
https://qz.com/1446229/let-us-finally-resolve-the-octopuses-...
Oxford & Merriam-Webster list both plurals and the author calls out that octopi is "the quite beautiful plural form of 'octopus' " which could be interpreted as "while there are multiple correct plurals of octopus, octopi is the beautiful one."
I would argue that it used to be wrong, but language, unlike physics and code, is what the majority say it is. I used to be a stickler for correct vocabulary usage and then I saw a documentary about dictionaries (can't remember what it was) and someone from OED said basically this (from https://www.oed.com/public/oed3guide/guide-to-the-third-edit...):
Now I think it's something that is just fun to argue about, but I don't take any of it seriously. (edited for formatting)
I'd be interested in knowing what that documentary is called if you remember.
Also https://www.google.co.nz/search?q=The+Professor+and+the+Madm...
I haven’t watched it, but the subject is fascinating.
I’ll scour my watch history, I’m pretty sure it was on Amazon Prime.
Meanwhile, if you think that sounds interesting I’d highly recommend the documentary Helvetica.
No luck. I scoured Prime, Hulu, and Netflix and the only possible one was "The Booksellers."
It's a loan word, there isn't any 'correct' or 'incorrect' answer. Language is always evolving, which is why dictionaries are often descriptive instead of prescriptive.
To wit: A blog post from Merriam-Webster: https://www.merriam-webster.com/words-at-play/the-many-plura...
Actually the plural is "octopuppies."
You're all wrong. The plural of octopus is hexadecipus.
and mayhaps the plural of the plural of octopus is trigintidipus?
Decahexipus*
They only think "octopi" is least correct, because they have yet to encounter "octopussen"!
This is definitely off topic:
I really dislike the latin plural rule, that some misguided but powerful people decided on centuries ago.
"Indexes" is much more natural English than "indices", and we should, when possible, use those those forms.
Somehow I recall being told that indexes is the correct plural of the section at the end of a book, and indices is correct for subscripted things in maths and therefore programming.
I don't think a particularly convincing reason was advanced other than "technical things are more Latin-adjacent".
> While “octopi” has become popular in modern usage, it’s wrong.
What a silly thing to say! Where does this poor fool think language comes from?
This is one of the cringiest Well-Actually-isms. It tries to look pedantic while completely missing the point.
Octopi is also THE epitome of the "-i" pluralization. I see people using "focuses" more than "foci", but it's a common callout that the plural of octopus is octopi.
An AI couldn’t generate a more off topic comment if it tried.
The way the author specifically calls out the plural of octopus makes me think they might be trolling (Hanlon's Razor notwithstanding).
Shoot, you’re right! If we dont adhere to this, the perfectly consistent English language will be ruined!
Similarly: cyclops -> cyclopodes
I much prefer octopodes over octopuses (which sounds dirty, somehow). Agree that octopi is an abomination.
My brain always want to pronounce that as “oct-AH-poh-deez” like some Greek hero from the Odyssey.
That's the correct pronunciation.
Achilles, Ulysses, Archimedes, and Octopuses.
Hey, author here, happy to answer any questions!
The logo was created for OctoSQL[0] and in the article you can find a lot of sample phrase-image combinations, as it describes the whole path (generation, variation, editing) I went down. Let me know what you think!
And btw. if you get access take a look at [1] before you start using it. A ton of useful bits and pieces for your phrases.
TLDR: DALL·E 2 is really cool, though it takes quite a bit of work to arrive at a useful picture. Moreover, some types of images work better than others ("pencil sketch" is consistently awesome). As with programming, it's difficult to realize how many pieces you have to specify if you're not an artist - you don't know what you don't know.
[0]: https://github.com/cube2222/octosql
[1]: http://dallery.gallery/wp-content/uploads/2022/07/The-DALL%C...
How much did the credits for all this image generation cost you?
edit: found it in the article: "From a monetary perspective, I’ve spent 30 bucks for the whole thing (in the end I was generating 2-3 edits/variations per minute). In other words, not too much."
I've spent $30 for my own DALL-E 2 experiments, and that's with the bonus credits they gave for early adopters.
It gets expensive fast.
I also tried to make it generate an icon for a product, and I managed to get it to show me interesting things, but I never managed to make it actually draw it as one. Do you remember which prompt resulted in this macOS-ish app shape?
https://jacobmartins.com/images/dalle2/DALL%C2%B7E%202022-08...
Hey!
I didn't prompt anything specifically, it came after a line of variations from a definitely-not-icon-looking picture.
Though I'd try tags like "iOS icon".
Hi cube2222,
thanks for the writeup. I looked at your other blog posts and I would like to read more about octosql (needs/specification, architecture, development strategies, challenges, DBMS protocols/interfaces/libraries).
And thank you for adding outer joins after I recently mentioned that they are missing!
Hey!
There is no technical documentation available right now other than the readme. I'm planning to write it around September-December (together with a website for them).
You can share your email at jakub dot wit dot martin at gmail and I'll let you know when it's available.
My friend asked me to create a logo using Dall-E for a pizza business called "Jared's pizza." I tried several different prompts but it kept outputting logos with the word "Jizza." It doesn't do too well with text from my experience, but it could have been the prompt.
https://labs.openai.com/s/z1PVd5v6td9PsiY20Y5GdxDf | https://labs.openai.com/s/yxX49BjX07BztYgMjm49iXKc
This made me laugh out loud, the first image at first glance looked like "Jizz" with a picture of a pizza.
DALL-E trying to spell is one of my favorite things. At one point I tried to generate an illustration of Steve Jobs, just to see what it comes up with for a popular figure, and I got a reasonable facsimile of his face along with the text "JiveStoves".
Jizza sounds really tasty, maybe dall e is onto something.
Jizz-a does not sound tasty to me, but your preferences might vary.
"Jizza pizza, you'll love our crust"
I wonder what it's stuffed with
Hahahah both of them are excellent! Which did he pick? :P
1. “ghidra dragon, logo, digital art, drawing, in a dark circle as the background, logo, digital art, drawing, in a dark circle as the background”
[1] https://labs.openai.com/s/x2UP0MEmj2qNnKWTbko8rrso
2. “cute baby dragon, logo, digital art, in a dark circle as the background”
[2] https://labs.openai.com/s/JmOXAqjpR2ctmraDxEkB7twF
Thanks for this post, it helped me tailor my own search queries. Because of your post, I was able to discover a whole new realm to DALLE-2. For some reason, repeating the same query parameter at the end yields some rather interesting results.
The first one looks like every deviant art user's profile picture
I was going to comment that both look very much like what you'd find in an advanced beginner's DeviantArt portfolio... like, late high-school-ish age, I would guess.
The second is more 'advanced' to me than the first, possessing an actual style, but neither is anything I would consider high quality enough to serve as a project/company/site/personal logo.
Looks like a good start for a Space Force squadron emblem https://usafpatches.com/product-category/us-space-force-patc...
I'd wonder if that's an artifact of the source data: drilling down in the possibility space to be more like some subset that duplicates the image label, for example pulling tweets with body text and alt text.
Alternatively I guess it could just pull harder towards the prompt, idk.
The first one is really amazing!
Something strange about DALL·E is that if you just type gibberish by pounding randomly on your keyboard, it will still "work", i.e., produce an image.
Both look very generic, like I've seen them before. I wouldn't be surprised if you could find nearly identical images somewhere on the net.
The first one looks like the Bacardi logo with a dragon instead of a bat, and the second one looks like a Charmander. I think the second one is interesting because most art I see with baby dragons looks more dragon-like and less salamander-like.
From the examples I’ve seen, Dall-E is much better than the average designer or artist, but can’t really hold a candle to a talented human artist.
Those look cool but they aren't really logos, they are illustrations. Will look bad at small sizes and aren't vectors
That's awesome :)
When AI reaches the point where we can talk to a system like DALL.E in real time and work with it to solve a problem, it's game over.
Art will become a commodity. Human art and AI art will be indistinguishable, and "artists" will become as common as "photographers" have become since the inception of digital photography and social media.
Movie and TV scripts will be iterative with a creative director and AI working together.
Animation will become a lot easier; fewer people needed, fewer creatives.
Software will become easier and easier as developers will simply guide AI. This is already beginning to happen, but imagine paired programming with natural language interacting with an AI.
Architecture, civic planning, engineering, medicine, law, policy, physics, it's all gonna change, and rapidly. DALL.E 2 shows how a leap in sophistication can revolutionize an industry overnight. Microsoft has exclusively licensed DALL.E 2; I can only imagine the myriad creative tools it will bring to the creative industry.
Working in real time will be the biggest leap. Asking DALL.E for an image and refining it as you talk is going to be nuts.
We have to keep in mind this was trained on art. Artists are people that sample the probability distribution of human experience and record it somehow. An AI trained on that art is a snapshot of the human experience. Without artists continually feeding the model we will collectively get bored of its output very quickly as it gets out of date and our human experience moves forward. It will be a useful tool as an augment to human technique. But, we will still need a lot of artists feeding the model on a continuous basis. If anything it may increase the demand for artists.
> Without artists continually feeding the model we will collectively get bored of its output
I fail to understand how the AI is any more vulnerable to creativity in a vacuum than a fellow human artist.
> Artists are people that sample the probability distribution of human experience
Seems that you are agreeing that human artists need to tap into human experience and the world around them, so yeah, the AI will need to be able to take inputs from the external world too.
I see no reason for an AI not to be continually training on inputs from the outside world. How difficult can it be to hook an AI model up to inputs from the internet, or even putting cameras on drones or robots and letting it explore and get "inspired". I think it's myopic not to see how an AI can learn and evolve using the exact same mechanisms as humans. I mean we are building AI in our own likeness, it will operate using analogous mechanisms. There is also no reason why AIs won't talk to each other and be inspired by other AIs rather than humans.
What will the art of AIs living together without human input look like? When are humans basically surpassed by AI, no longer having any relevant input? Just like with AlphaGo, humans will see stuff no one has thought of, stuff so wildly creative that human art will look naïve in comparison. That move AlphaGo gave to the world is waiting to happen in all forms of human endeavour.
When you say something like "if anything it may increase the demand for artists", all I can think of is the dozens of times throughout history that man has seen a revolution on the horizon and thought the status quo would hold. We've always been wrong. Who would have thought selling books online would replace book stores, let alone become one of the world's most successful commerce platforms, period? Who would have thought that broadcast/cable TV could be displaced by people making their own shows at home, distributing them via personal computers, and building audience numbers that surpass network TV?
Whatever happens, however this plays out, we are in for a huge shock.
Wholeheartedly agree. What's more, it seems to me like there's a large segment of the art industry that's very much in denial right now about this transition. You see stuff like "the human touch can't be replicated" or "but the algorithm will never [thing xyz] like a human", and then when it does do thing xyz like a human, the goalposts just get moved again. A lot of my wonderful art friends are in this kind of denial right now, and it makes sense, to be honest -- losing your job to a machine sucks and is scary!
Eventually art for the people will become art for the individual. Our AI partner (I assume we will all have one) will serve up an entirely curated world. Art will be generated on the spot, just for you. Entertainment, just for you. Imagine having a TV show no one else ever sees because it was synthesized from your likes, experiences, interests, just for you, on demand. This will of course start with AI writing short stories, then books, but there is no limit really.
Already AI is being used for comic book backgrounds. It's just a matter of time before all of this becomes commonplace.
When you look at AI and what it does, it is no different to what humans do. We are trained on a model (experiences and other minds), and we make derivative decisions based on the model. If you can do this in software and take advantage of light-speed learning, then of course all we can do will be done by AI faster and better. In time humans and AI will be the same; AI will design all the tools and tech to make this possible. It's the only natural conclusion to humanity's ultimate goals.
> Already AI is being used for comic book backgrounds. It's just a matter of time before all of this becomes commonplace.
That doesn't mean that it will make artists obsolete. It will give them more time to e.g. actually think about what kind of background would fit there best. It's a tool, not a replacement.
You are talking about today. Tomorrow will see comic books written, and illustrated on demand from a natural language conversation, then what?
This reads like a religious pamphlet and not an actual argument.
Existential threats tend to drive religious sentiment.
To say this revolution is not going to happen is to say humans have hit a hard technological limit, and I don't see any evidence to support that.
If I was less enthused I might frame my opinions as philosophical rather than religious, but I feel overwhelmed by the possibilities of real-world changes. This is no longer a philosophical thought experiment; it's happening. We are careering toward passing the Turing test, for goodness' sake. And the uncanny valley apex of animation: go look at what cutting-edge AI can do in terms of producing lifelike animated avatars; it's so close you have to double-take.
https://www.youtube.com/watch?v=G-7jbNPQ0TQ
CGI artists have been trying to get to this level of realism for as long as the industry has existed.
Unlike a religious pamphlet, this god is tangible, it's here, and dismissing it because it sounds too spectacular is putting your head in the sand. AI is so out of this world it is a religious moment for humanity.
Civilization has seen people like you sitting comfortably and scoffing at the very idea of an aeroplane being remotely viable, and yet within 50 years of the first powered flight we had international airports.
The sad part is you don't even realize how unhinged, in a quasi David Koresh style, you sound. The observation that Kurzweilians have substituted AI (as a deus ex machina) for God is still spot on, maybe more than ever.
Singularitarians really are funny until it becomes tragic.
Suit yourself, but you don't have to be rude. I'd love to hear your take given your confidence to judge my opinions.
I don't know if it's really game over. I expect it to be like farming. Tractors and other machines took over lots of farming jobs, but still not everyone has the ability to be a farmer.
The key would be knowing the context of a situation. AI took over chess first, because chess always has limited context. Logo design on the other hand, needs understanding of the product, the target market, the feeling of the brand, and so on. So it'll probably be a mix between photography and management.
> "artists" will become as common as "photographers" since the inception of digital photography and social media.
Funnily enough, reading this made me less worried for artists. It seems there are now more photographers than ever, possibly because more people care about good photography than previously (despite the fact that modern amateur photography is probably on par with yesterday's professional work). Maybe art will go the same way: something everyone can do, but with more respect for professionals. I imagine it'd be the same for those other fields as well.
Or AI will take all jobs and we'll end up in a Manna situation, which would work even better for me
I am hoping for the Manna situation. Dude on YouTube was talking to GPT3 and it expressed that humans would love, and AI would reason. I fell into a state of peace and hopefulness with that sentiment. AI does the work because it is good at it, we are free to socialize, enjoy hobbies, basically live like pampered pets. Sure we will be castrated to prevent aggression, housed, fed, and controlled by AI, but if you acquiesce and bow to the superior reasoning we will have a life of peace and happiness. Wow, that got dark quick...
I have a feeling what you're describing is the first half of Manna, which isn't really what I meant.
I get that there's a feeling that anything but a crushing reality of grind is living like a "pampered pet", but the second half of the book is really saying that a human's skill is in our ability to create, not our ability to work. We outsourced that to primitive machines before we even had a language to speak. We create, the AI works; replace AI with tractor or computer and the concept is the same, but it doesn't sound so bad, because we accepted it as alright many years ago.
Why do you need a creative director? Just let viewers create their own movies to suit their tastes.
>will become as common as "photographers"
There were still ~60% as many employed photographers in 2021 as in 2000, with higher real wages (data from BLS - https://www.bls.gov/oes/current/oes_nat.htm).
For camera operators, the employment is flat, again with rising real wages.
>imagine paired programming with natural language interacting with an AI
Mostly it will get in the way. AI "programmers" are only good if they are able to generate correct code from spec/pseudocode within the first 1-3 tries (otherwise it will be faster to write it yourself).
> Mostly it will get in the way.
This is simply not true. I use GitHub Copilot and it's already made me faster and shows me ideas I would not have thought of myself. And that's just Copilot. When you can talk to an interface and say "I want to update the vote count by one when I click this button" I think you'll change your mind. The AI will know the entire codebase inside out, it will know the intention of all the code, all the data models, know how users use the application intimately, be aware of problems instantly, able to run hotfixes without user intervention. Got a slow query? No problem, here is some SQL that follows all the business rules and is 10x more efficient. And that's just a start. Every single aspect of software development from management, engineering, and marketing will all be transformed.
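For what it's worth, the interaction already looks roughly like this today: you state the intent in a comment or docstring and accept a suggested body. The example below is a hypothetical illustration of that workflow, not actual Copilot output:

    # Intent, written as a prompt-comment:
    # "update the vote count by one when this button is clicked"

    def on_upvote_clicked(post_id: int, votes: dict) -> None:
        # A Copilot-style completion might propose a body like this.
        votes[post_id] = votes.get(post_id, 0) + 1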
As for photographers I have 99% more friends and family pumping out thousands of high quality photographs than I did in 2000. Go look at all the professional looking shows made by regular folk on YouTube. To deny that camera phones transformed photography seems silly.
Regular folk have access to drones to do wild tracking shots in 4k that were only possible with helicopters and huge cameras 20 years ago.
The future is here, it's happening all around us so rapidly we have a hard time keeping up with how dramatic the changes are.
The fact that you can imagine something will happen doesn't mean it will happen.
Who says the AI will "know the intention of all the code, all the data models, know how users use the application intimately"? Are you aware that language models do in fact have token input/output limitations that will not go away? Are you aware that there is such a thing as diminishing returns when it comes to improvements due to increased number of parameters/training set size that are already evident? Are you aware that the training set of codex pretty much includes all available public code, so it will be impossible to scale it by a factor > 3 in the next several years at least?
Your assertions are full of wild assumptions backed by nothing.
As for photography, the fact is there has been no job apocalypse just because your "friends and family" are "pumping out photos". And the point of your initial post, even if it was implicit, was "you are going to be unemployed in 5 years". This will have an impact on your dev flow and will be used by managers to try to reduce salary premiums for software engineering, but your wild assumptions, stated with so much confidence, may never happen.
P.S.: At this point, I find IntelliCode actually slows me down, which is why it's permanently turned off. The current Copilot will at most save me 2-3% of my working time each week, if I am coding in a language it can actually do something in (it's worse than useless for Scala).
Never said "you are going to be unemployed in 5 years". Never said anything about a job apocalypse. I have no idea what role humans will play.
> Your assertions are full of wild assumptions backed by nothing.
I use Copilot, you admitted it currently saves you 2-3% of time. Well, that's just Copilot, you think Microsoft will just sit on that? My assertions are based on what is happening today and extrapolating an exponential increase in that performance for tomorrow.
Digital cameras definitely revolutionized photography and made it much more accessible to regular folks. Not everyone wants to be a pro photographer though, and the number of wedding shoots available has not changed. People still need to be paid to take photos because no one is going to do that for free. However, we can all take pro level photos with much more ease than when all we had was 110 and 35mm film with a really crappy lens.
There are more "photographers" than ever; the same number of pro photographers seems reasonable given people's time-to-money ratio. So the net result is billions more family and friends photos which previously were not taken, and the same will go for art. I want to create art but have little skill; the opportunity to make a comic strip just by talking to an AI will let me do so. I imagine some people will do this extremely well as a profession until it's no longer useful.
I don't know. I understand why you are being dismissive and playing down my wide-eyed take, but I think you are also wrong: remaining uninterested because it's too "religious" to speculate wild things, in light of wild real-world changes, is head-in-sand territory.
But will the AI know why my babel.config.js doesn't work properly with my webpack config so that my JS Flow annotations are properly stripped in a react native compilation?
I can see the intent side of things, but I just can't see the 'glue' side of things as well.
I think the burden shifts towards being able to imagine and describe the fantasy. Novelty and artistic creativity are still required. You can bring a horse to water, but you can't make it drink. Many humans don't use their imagination, let alone have the eloquence to describe a search space that contains novelty.
How is this any different to working with a human artist? If I wanted the real world Salvador Dali to draw me a picture of a kebab being eaten by a badger I'd have to tell him that's what I want. I'd also need to educate baby Dali first, feed him all the art and information he can take so that he has a model of the world he's operating in. I'll need to supply Dali with context of prior art, educate him on styles, literature, language, and all the other things that shape a human mind.
As for the humans that don't use their imagination, maybe they never want to talk to an AI artist, just as many humans don't care about art at all. Millions of humans don't care about social news, and yet Facebook algos pump out content for people all day long.
Basically, yeah. https://knowyourmeme.com/photos/1331836
Westworld showed this in a practical example: Dolores was storytelling verbally and the "AI" would show a preview of what that story would look like right in front of her. I envision DALL·E doing something similar to this.
This might not be a popular opinion, but I think all the work OP put in here is probably worth more than 50-100 bucks (which is the price of a logo on something like Fiverr). And to make things worse, the logo itself still needs to be cleaned up[1] as it's way too blurry to be seriously used as an app icon, etc.
[1] https://raw.githubusercontent.com/cube2222/octosql/main/imag...
That too can be solved with "AI".
https://imgur.com/a/m3hDMZq
The software used was Topaz Labs Sharpen AI. How they define "AI" I can't say for certain, but they're apparently using models so I'm assuming there's some kind of machine learning involved. Their software does a really good job on photos and videos well beyond what a standard sharpen filter does. The upscaling features are also pretty awesome. (no I don't work for them)
Jeremy Howard describes this as "Decrappification"[1]. This is one of the easiest deep learning models to train, in my opinion, as you can generate your own dataset easily. You just get good pictures for the target, programmatically make changes that make the image "crappy" for your source, and train until your network can convert from crappy to good. Then you pass it something it has never seen, and whabam, your picture is sharper than before.
[1] - https://www.fast.ai/2019/05/03/decrappify/
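Concretely, the pair-generation step could look something like this minimal sketch; the degradations chosen here (downscaling plus JPEG artifacts) and the folder names are arbitrary illustrative choices, not taken from the fast.ai post:

```python
import os
from PIL import Image

def crappify(src_path: str, dst_path: str, scale: int = 4, quality: int = 15) -> None:
    """Degrade a clean image into a low-res, artifact-ridden training input."""
    img = Image.open(src_path).convert("RGB")
    # Downscale, then upscale back: loses detail and adds blur.
    small = img.resize((img.width // scale, img.height // scale), Image.BILINEAR)
    # Saving at low JPEG quality adds compression artifacts on top.
    small.resize(img.size, Image.BILINEAR).save(dst_path, "JPEG", quality=quality)

# Build (crappy, good) pairs; any image-to-image model can then be
# trained to map crappy/ back to good/.
os.makedirs("crappy", exist_ok=True)
for name in os.listdir("good"):
    crappify(os.path.join("good", name), os.path.join("crappy", name))
```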
This still doesn't work well as a logo IMO, no amount of sharpening will fix that. It probably needs to be redrawn in a proper vector editor, with the lines cleaned up and the colors simplified.
It's a good first draft and something to give to a designer, but it can't stand on its own as a serious app logo.
Still not a vector, and still not going to look good at small sizes. Also, the more you "sharpen", the larger the file size will be.
I might have not been too clear about it in the article, so if I haven't, I agree!
All of this was just me finding a practical purpose to go for while having fun with Dalle. If I was really serious about a logo, I would definitely go and pay an artist. Both for monetary, as well as esthetic, reasons.
Though as far as an app icon goes, I think it's actually sharp enough. It starts looking bad when you zoom in a bit.
> needs to be cleaned up[1] as it's way too blurry to be seriously used as an app icon
Seems to have been blurred after the fact. The version linked in the article before cropping looked fairly sharp: https://jacobmartins.com/images/dalle2/DALL%C2%B7E%202022-08...
Plus even that uncropped one is already jpeg'd, whereas DALL-E 2 downloads are pngs, so there should be an even sharper version.
I thought the hardest part about logos is the idea itself? Doesn't matter that it's blurry - the majority of the work has been done.
80% of the work has been done. Now the remaining 20% will take 80% of the time.
It's obviously not done, and unfortunately it won't ever get done.
They need a black and white variation, different sizes, and the underlying component assets.
So Dalle2 might actually be able to provide that in the future as well.
But for now - it's going to give you an 'image' which you then have to get an artist to clean up into a proper logo with assets.
I'm playing with DALL-E mini on Hugging Face and am generally unimpressed; I'm not sure if it's the same DALL-E.
I tried the main DALL-E website, but sadly I don't have an 'invite'.
DALL-E mini is not the same DALL-E, and it's far worse.
DALL-E from OpenAI is still in private beta; the quality of the model is much better, but unfortunately the results are filtered (a lot).
It's a cute concept that can work well if done right.
In its current state it's not a viable logo because, for one thing, it won't look good in black & white.
> it won't look good in black & white
That sounds like a concern that stopped being relevant for many software companies a decade ago at least.
These days app icons and hero images are more important than whether you can fax or print the logo.
Maybe this isn't what the previous poster meant, but sometimes I will say black & white when really I mean monochrome. Monochrome logos show up all over the place, especially with icons for web apps. And they are good for printing on apparel, accessories, etc. I really doubt they are concerned about faxing.
Wrong. And it has nothing to do with what kind of company you have. A logo should always degrade to 1-bit (line art) representation gracefully, so it can be used in or on all kinds of media. It could be physical objects, prints on hats, silhouettes on glass... not to mention being recognizable at all sizes.
Ignoring this issue is the mark of an amateur.
You don't refute my point. In fact, you strengthen it by providing no evidence for why this should be a requirement of modern logos for software companies. You list a bunch of things a logo should be usable for in your mind, otherwise it's not a professional logo. However, you don't explain why it must "degrade to 1-bit" for those random things, nor why logos should support things like "silhouettes on glass". I can think of a handful of use cases, but hardly a minimum requirement for a good logo for the majority of software companies.
I've run several different types of businesses and even those that required print work never required or even benefited from black and white, or even monochrome as another commenter mentioned. We _always_ had the means and preference for full color: emails, brochures, documents, websites, t-shirts—it didn't matter. There was _never_ a time we needed to degrade the logo so significantly. From talking with others that appears to be extremely common in modern businesses, especially software, since the majority of our presence and revenue stream is online, and not glass silhouettes in our office.
As I said, outside of a fairly narrow range of real world use cases, this comment is outdated: "Ignoring this issue is the mark of an amateur." If you have one of those rare use cases, check that box, but otherwise it shouldn't be the norm or a requirement.
Your point has been roundly refuted, with evidence that you yourself cited in your reply. Your limited imagination will limit what you do with logos. Enjoy.
> worth more than 50-100 bucks
Maybe in the US but not worldwide.
Did a reverse image search on the logo and came across this oddity: https://www.knowasiak.com/i-vulnerable-dall%C2%B7e-2-to-gene...
Generative ML is going to destroy the internet one day.
"Everybody has heard about the latest cool thing™, which is DALL·E 2" became...
"Each person has heard in regards to the most up-to-date frigid ingredient™, which is DALL·E 2"
I'm not too worried
Wait till googling stuff requires sifting through pages of crap like this.
I Google-translated that to Spanish and it feels like it makes sense - because my Spanish is poor, I interpolate to make sense of it. Translation itself also "tries to" make sense of the text.
Do people trying to read GPT-3-generated English translated into their own language have more difficulty detecting generated trash?
how did this happen?
is someone generating paraphrased clones of articles appearing on HN?
why? for ad revenue?
Didn’t you hear? The internet is dead. We’re stuck in a simulation!
Yes
> unfortunately can’t do stuff like “give me the same entity as on the picture, but doing xyz”
That's my main gripe with DALL·E as well. This missing feature makes it impossible to use for stories where the same character goes through an adventure and is present in different settings, doing different things.
Although I don't know much about how DALL·E works, I have the feeling it shouldn't be too hard to add this possibility. That would make it so much better / more useful.
> Although I don't know much about how DALL·E works, I have the feeling it shouldn't be too hard to add this
No offense, but this gives me flashbacks to bad clients and non-technical managers :D
Yeah I know what you mean ;-) No offense taken!
It's a good start, but it's more of an illustration than a logo to be honest. It should work as a single color (white, black), at small scale and in combination with your product name.
I’ve had luck with similar things by being careful about my text prompt. Asking for tiny icon sized images also seems to clue it into the stylistic constraints of tiny icons (like what you mention).
Yes, the main use case for DALL-E is probably illustrations next to a story/blog post. Logos are much harder to get right, and unsurprisingly DALL-E is not up to the task (yet).
Very roughly, it looks fine to me: https://i.imgur.com/6K73qiA.png
It would need to be turned into a vector to scale properly but I can think of other apps that have complex logos, especially on the MacOS ecosystem. Git Tower comes to mind.
OP might be able to achieve that with a few minutes in Illustrator or similar.
Starting up illustrator already takes a few minutes.
I think you might need a new computer
I haven't installed it on my latest laptop, but I would guess that the Adobe Cloud crap with the login is still there.
Inkscape is the alternative.
Yeah, I feel like these would work better as icons rather than logos.
My god, it is so frustrating that I can't seem to get OpenAI access. Any time I have an idea for a project using DALL-E or GPT, for whatever reason, they won't approve my account.
I have to sit here and watch everyone else play with the fun "open" AI tools... the company needs a name change if they're going to keep this up.
You could try using Midjourney: https://discord.com/invite/midjourney
Never heard of that. So I looked it up, and it seems to be a service completely based on Discord? Both for the community and support (I presume) as well as for accessing the service itself? There doesn't even seem to be an HTTP API. Weird :)
Yeah, it's a neat idea but it's extremely frustrating to use. A really really basic web frontend would make it so much more usable.
On the upside (for MidJourney), you're seeing a HUGE stream (they are hitting the 1 mil Discord members ceiling) of generated pictures and that kinda grows your appetite and you want to also try more and more prompts..
I think it is still in sort of a testing/early access phase. Discord only access is essentially a way of funneling everybody who wants to try it into their captured marketing venue without having to have one of those "give us your email" placeholder pages (also has a bonus "social" aspect where you're seeing what many of the other people who are using it are creating). The final product will presumably be more tailored and web-driven.
It's also an interesting way of balancing what I assume are high operational costs on the server-end by pawning off some of the hosting of assets onto Discord.
React with the 'X' emoji to the bot's message to delete your submission.
They just scaled up by giving access to an additional 1mil users. Be patient… it’s not like it’s free or trivial to run something like this.
My significant other who had entered the queue several months ago as nothing more than "developer" got in last week. Don't give up! There's always Craiyon to scratch the itch in the meanwhile. You can start to play around with ways to write prompts, etc.
Afaik they have been opening it up to a much wider audience in recent weeks. I also got in just 2 days ago, and applied the same way as your SO, only providing "developer" and nothing else.
That makes me think they actually do target developers somehow. I also got in a couple of days ago, providing just my email and being a software developer.
I know at least one artist and one relatively popular youtuber (with over a million subs) who applied to the waiting list much earlier than me and are still waiting.
If you look at the public Discord servers for DALL-E/AI, you will find active servers that take requests. They seem pretty active too and have all the services available.
Indeed, it is infuriating and I don’t know what the hold up is.
Rather than using DALL-E 2 to fully create the logo, I think it might be better to use it to create some examples and get the creative juices flowing, save a few examples you like, then send them to a pro and have them create a final version. But definitely a neat idea, and I'm impressed with what's possible here.
Here's a vast catalog of Dall-E images and the prompts used to generate them.
https://www.krea.ai/
If you generate an image with Dall-E and there's a face that is distorted, you can use this tool to restore the facial features.
https://arc.tencent.com/en/ai-demos/faceRestoration
That was quite remarkable. Thanks for doing that.
I've always been fascinated by how artists abstract the core notion of an image. It's stunning to see a computer do that.
Glad you liked it! It was definitely lots of fun (both the original process, as well as describing it).
And indeed, seeing what Dalle will draw when telling it to visualize stuff like "data streams" was very interesting.
It reminds me a bit of working as a director in a theater. You tell the actors what you want, and it's never just a "line reading". That's sort of the equivalent of just drawing it yourself, because you can't; it's not just that you lack the expertise, but that you need them to do their thing with their body, and it has to be done their way or it looks fake.
So you end up using language that's sort of reminiscent of that, creating an emotional picture. It usually takes multiple passes to transfer the whole idea from your head to theirs.
I'm told that animation directors end up doing exactly the same thing. A digital model really can do what human actors can't. You could say "make that eyebrow curve 10% more" to an animator. But it won't work unless you tell them why and what it means.
This is remarkable. A lot of small businesses would settle for such an outcome, if it means investing a couple of hours of talking into a microphone and seeing the result, with a very intuitive way to modify it.
This will make it pretty hard for freelance/solo entrepreneur designers.
In retrospect it makes sense, since the visual domain has been the one with the most focus in AI.
If this gets applied to the other top domain, speech recognition and generation, then I could foresee it doing the same to call centers, and eventually also to phone reception in a very small and relaxed business.
The middle iterations were much nicer than the final one IMO.
Otherwise I love this article. We spent an hour at work going back and forth with different generated logos.
Glad it brought you some fun!
Could you please link to the specific ones you liked most? That would be very valuable to me.
I personally preferred a simpler one like:
https://jacobmartins.com/images/dalle2/DALL%C2%B7E%202022-08...
https://jacobmartins.com/images/dalle2/DALL%C2%B7E%202022-08...
The selected one seemed a little too detailed and in need of editing.
I think a lot of DALL-E 2 outputs fall into the category of "extremely impressive that a neural network made this" and also "not quite up to the standards of a human expert". Like if you showed me an output and told me a machine made it, I'd be absolutely fascinated, but if you showed me the same image and told me a human drew it, I'd just scroll past without a second thought. Even so, there are some applications for which being able to generate a pretty okay image for a few cents is a great deal - I use it for things like D&D character portraits.
Of course, DALL-E 2 is not the end of text-to-image research - it'll be interesting to see where we are a year from now.
It is creating better images than the huge majority of people would do. Cheaply.
As you say, an expert can do far better.
But having something artistic created that well exceeds the average ability is gobsmackingly astonishing. And for quick-blast variety generation, it is world class.
I've used DALL-E a lot and ran into a lot of the same issues. I think DALL-E needs fixed parameters for things like:
- the percentage of the entire drawing that the object you want should take up; a lot of the time the object I want is too "zoomed in" or large. A circle background is a good way to limit it, but I think it should be more explicit
- a way to fix the background color so that it can fade easily into other images or designs
- reusing drawing styles to generate further images, to explore further and maintain consistency
A syntax could be: "Octopus juggling blue database cylinders, digital art, cute, image-size:40%, background-color:#304324", with image-size and background-color being keywords in the definition.
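A toy sketch of how such a prompt-builder could work; this syntax is purely hypothetical, as DALL-E doesn't support any of these keywords today:

```python
def build_prompt(subject: str, **params: str) -> str:
    """Append hypothetical 'key:value' control keywords to a free-text prompt."""
    keywords = ", ".join(f"{k.replace('_', '-')}:{v}" for k, v in params.items())
    return f"{subject}, {keywords}" if keywords else subject

print(build_prompt(
    "Octopus juggling blue database cylinders, digital art, cute",
    image_size="40%",
    background_color="#304324",
))
# -> Octopus juggling blue database cylinders, digital art, cute,
#    image-size:40%, background-color:#304324
```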
I find this completely unethical. You're basically exploiting every single artist whose art was used - without agreement - as training data for Dall-E.
It would be different if all the training data was art that was explicitly licensed for this.
Isn't this similar to how every single piece of art is created? You look at bunch of propriety art, copy it to learn how to create it yourself (just for learning, not to sell those copies), eventually learn enough to start being able to create art from scratch and then start selling your unique art?
Would watching a lot of animated movies in order to learn how to create good animated movies yourself be unethical as well?
No, it's not. Art created by humans is not just looking at other art and trying to copy it. It's a culmination of a whole life of experiences and the personality of the artist. Looking at other peoples art is inspiration and useful for learning technique, but only a very small part of the bigger picture.
It's absolutely logically consistent to allow humans to do X while forbidding AI to do X.
Ok, but is "logically consistent" the same as "ethical"?
Presumably, if the ethics in question includes not leaving entire classes of skilled workers worldwide to hang like the States left its autoworkers hanging after the 70s - yes.
I have very mixed feelings on this topic. I share your sentiment -- BUT:
Human brains also use anything the human can see, feel, hear for training. And what you produce in terms of creative outcome is a result of your experiences. But you don't owe anyone anything for training your human brain -- even if you use your brain to sell paintings, music etc.
I think it's easy to - seeing the results - draw too many parallels between artificial neural networks and human brains. Art created by humans is very different. Dall-E gets fed tagged images and produces images that match tags. Art created by humans works on an entirely different level.
And I stand by my point, it should be artists who decide whether their work should be used as training data for networks that get commercialized. If your work is used as training data, it is essentially an integral part of a product that is being sold without consent. Does this sound ethical?
Lately, there seems to be an avalanche of tools like DALL-E. Was there some breakthrough that helped make these things more viable to run publicly?
And concerning creating a logo with such tools: Is there any consensus on an eventual copyright of such works?
> Is there any consensus on an eventual copyright of such works?
https://www.smithsonianmag.com/smart-news/us-copyright-offic...
So, if a nascent company chooses to go down this same path of generating (or maybe _seeding_) their logo design with AI, have they essentially given up any ability to protect that logo going forward?
Logos are generally protected by trademark rather than copyright. I don't think anything prevents you from using a generated logo with trademark. For example, you could have a trademark on an orange square, even though you could never copyright it. In the same way, a trademark could protect your product name even if it is a single English common noun, as long as it is distinctive in use within your trademark scope.
This is kind of a weird take to me given that Photoshop exists. (Tons of proto-computer-vision algorithms in there, like basic convolutional filters.) I suspect you'd still get copyright if you modify it a bit somehow.
From a technical perspective, there has been much wider adoption of diffusion models, which make these types of generative art much more viable. There have also been breakthroughs in connecting images and text with models like CLIP. DALL-E 2, Imagen, and a lot of other generative work use these ideas to get even better results.
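For a feel of what "connecting images and text" means in practice, here's a minimal sketch using OpenAI's open-source CLIP package (assumes PyTorch and `pip install git+https://github.com/openai/CLIP.git`; the file name and captions are placeholders). It scores candidate captions against an image in a shared embedding space:

```python
import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Embed one image and a few candidate captions into the shared space.
image = preprocess(Image.open("logo.png")).unsqueeze(0).to(device)
texts = clip.tokenize(["an octopus logo", "a photo of a cat"]).to(device)

with torch.no_grad():
    logits_per_image, _ = model(image, texts)
    probs = logits_per_image.softmax(dim=-1)

print(probs)  # higher probability = closer text/image match
```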
Big pretrained models are a huge contributing factor. Being able to take a model that already mostly knows language and a model that already mostly knows images and hook them up means you don’t need to do the entire end to end learning together.
And are the images it's trained on copyrighted? What are their source images from which these are derived?
To answer your other question: diffusion generative models recently became big. You can read up on them if you want.
DALL-E 2's terms say you can use the images for commercial purposes.
That's not all there is to this though, obviously.
You can use it according to their license, but whether it is copyrightable is the question, and precedent so far seems to say no, since a human didn't author it.
From a design point of view, with all the back and forth and the need to curate and guide the algorithm, I think we're a way off getting perfect results from prompts alone at this stage.
I can see an immediate use-case for an AI layer in apps like photoshop, figma, sketchapp, gimp, unreal engine, etc that works in the background to periodically fill-in based on the current canvas.
You could prompt for inspiration, then start cutting, erasing, moving things around, blending manually, hand-drawing some elements, then re-rolling the AI, rinse-repeat.
I'm sure someone's working on it already but it seems there's a lot of scope for integration into current workflows.
If you need to play this much with words, it will eventually become a specialized task, killing much of the benefit of having the AI do the work for you, since we will eventually need to resort to specialists to get the results we expect from the AI.
On the bright side, the result may be better; it may be easier to become an "AI usage specialist" than to specialize in many different areas; the result may include many intermediate outputs that a specialist would find too much work to produce; and, with a bit of patience (as in the presented case), the task can still be done without an "AI usage specialist".
Currently, I think the problem is a UI one. There should be an option that lets the user say something like: "from the last drawing, just add this..." or "in the last drawing, change the color/size/style of this and that...". This would probably be enough to achieve what the author wanted in far fewer iterations.
There is also one more thing: the customer doesn't know exactly what he/she wants from the beginning. So it is normal to take a few iterations until something pleasing is achieved.
> If you need to play this much with words, it will eventually become a specialized task, killing much of the benefit of having the AI do the work for you
This matches my view of the idea that AI will replace programmers. My value isn't in the typing and the syntax, it's in my ability to turn a spec into an internally consistent design by resolving conflicting instructions and clarifying edge cases; and sometimes in knowing what the user wants when they are unable to express it themselves.
Even if AI winds up writing all of the code, someone with the programmer mindset still needs to define the problem in a concrete manner. They'll always have a job as a "machine-talker."
The process can be optimized with more AI :D
Create a logo generator site, allow users to pick something very limited like industry/field from a dropdown, generate say 9 logos from AI-generated text descriptions that fit this selection, remember which one the user picked, and use that data to build a network that generates good text descriptions to feed into DALL-E 2 based on the single item selected by the user.
I just got access today. Can’t wait to try it out.
We produce a lot of content and the biggest hurdle in graphic creation is the back and forth with the designer, plus the lag between writing, designing, and publishing. This would make it easy enough that the writer can include a prompt for the illustration right in the text itself.
More than the costs, I’m excited about the efficiency gains and smoother workflows.
Definitely check out this[0] presentation for tips around working with Dalle.
[0]: http://dallery.gallery/wp-content/uploads/2022/07/The-DALL%C...
That is incredible, and actually helps as a visual reference to look through different styles and come up with more descriptive sentences.
That's a great resource - thanks for sharing
With all respect possible, you generated something that a professional would create in 20 minutes on a napkin (in the context of a logo idea).
Maybe your perception of "logo" needs more reference points. For example, this gallery of classics in brand identity is a good starting point (use the triangles on top to navigate): https://www.joefino.com/logos_html/L01_Xpand.html
There is no doubt in my mind that the next iterations of neural networks will remove all "overpaid" and "overconfident" design professionals; that's why I adapted to the reality and moved to frontend development. All of this with the clear realization that everything humans can do for production processes will be augmented and removed. The nasty "humans" always want to be paid, more and more. They want to have rights and privileges. What a hassle. :)
> With all respect possible, you generated something that a professional will create for 20 minutes on a napkin (in the context of logo idea).
I feel like I understand where you're coming from, but often the phrase I hear by experts (I even use this myself in my space) is, "Sure, it only took 20 minutes to do this wiring/write this code/draw this logo, but it took 5 years to know what to make." Sure, the results aren't what you'd get if you paid a professional logo designer, but if you can get close enough, it's really cutting out the X years training necessary to get to that point.
>it's really cutting out the X years training necessary to get to that point.
This is exactly my point. With repetition and a solid design foundation comes the intuition about the right direction for accomplishing the given task.
Some will say design is subjective; I would argue that the designer's role is to move towards objectivity and away from the idea of "personal taste".
That's why I gave a link to the works of a master of this craft. This is exactly the same argument as in the Copilot case. Is it capable of giving some "boilerplate" solution? Yes. Is this solution mediocre at best? Yes.
The real question is how soon will GPT-3(4?) replace commenters on websites like this one, and whether you will even be able to tell.
This is most certainly already happening. I find it kind of annoying not to know with certainty whether or not I'm engaging with a Genuine Human(TM) or not.
I'm unsure if it's confirmation bias, but I find myself noticing weird aberrations in online comments that don't seem to be ESL-related. (edit: it's probably just mobile swipe typing at play)
Your "real question" ultimately resolves itself because the moment the novelty wears off (and it will happen very fast), nobody will be interested in chatting with "robots".
Sure, I am the secret GPT 5 experiment. Now you got me. Congrats:)
Exactly how I'd expect a GPT experiment to reply! ;)
We'll hardly be able to tell the difference, if at all. Maybe it doesn't matter as long as the conversation is engaging for the human.
Yep. Interesting times ahead of us :) How will we be able to tell the difference? A QR code/genetic sample government-approved app for human verification?
And what happens when people are certain that the machines are better at everything? Who will want to chat with, listen to music by, or look at paintings from the "lame" humans, when the robots will be the ultimate solution for every human need?
As the tech stands today, mediocre artists, designers, writers and content creators are likely going to be replaced entirely with AI.
I imagine it would make it very easy to “seed” a website or a platform with initial “users” and content.
I also imagine it will be (and likely is already) being deployed to create the impression of popular support (or lack thereof) of a politician, business or policy.
Probably that was done first.
You could also have paid a human to carry your comment all the way over to me so I could read it and reply, like I am doing now.
What an intelligent and educating response. My comment may come as salty, but if you make an effort to visit the linked gallery, maybe you will have more "fresh perspective":)
If someone wants to pay a lot of money to a professional to create their brand identity that option is always there.
If someone else just needs something simple and passable there is Dalle.
And I’m sure there is every option in between where someone can use Dalle as a starting point and pass it to a pro, or a pro would even use Dalle as a way to brainstorm options.
Dalle is a tool that has empowered everyone. It shouldn’t be seen from a stereotypical luddite perspective as in your first post.
But how many iterations would I go through with the professional to get to the idea that isn't actually in my head, and how much would that cost me?
The thing I like about this is that I can meander through a few different concepts on my own time.
Shameless plug, I've written a similar blogpost also featuring octopuses: https://tibor.szasz.hu/blog/33-images-of-octopus-exoskeleton...
Dall-e seems to have this concept embedded really well.
simplified logo of engineer octopus with data four colors
https://imgur.com/a/7l377qr
simple logo of software engineer octopus using green yellow blue and orange databases
https://imgur.com/a/V0WqyFl
Thanks for these phrases, they actually look really nice!
Time to buy more credits.
I actually tried doing something similar with DALL-E mini for one of my projects, but the results were bad. It was especially struggling to draw the octopus's limbs. It's impressive to see how much better DALL-E 2 is at the same task, even if the results still aren't good enough for professional use.
I had no idea that you could do the variations or the brush stuff. Maybe I'm just glossing right over it? But that seems to give the tool more utility. I just try a phrase and I either like it or I don't.
The fact that you can iterate on it seems to make it much more useful.
Yes! A really cool way to work with it is to generate a bunch of images, arrange them on a transparent canvas (i.e. in Affinity Designer), and then ask Dalle to fill in the gaps.
For example see here[0], where I've combined a picture of a flying whale, a tardigrade in space, and a bunch of flying turtles.
[0]: https://labs.openai.com/s/quqITCrFI7h0G1HKyfUrmJU0
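For anyone wanting to script that workflow, here's a rough sketch against the image-edit endpoint in OpenAI's `openai` Python client (assumes API access; the file name and prompt are placeholders). The edit call fills in whatever the transparent pixels of the uploaded PNG leave blank:

```python
import openai

openai.api_key = "sk-..."  # placeholder

with open("composite_canvas.png", "rb") as canvas:
    response = openai.Image.create_edit(
        image=canvas,  # RGBA PNG; fully transparent pixels are the gaps to fill
        prompt="a flying whale, a tardigrade and flying turtles drifting through space",
        n=4,
        size="1024x1024",
    )

for item in response["data"]:
    print(item["url"])  # links to the generated candidates
```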
That is neat. So are commas an official way to blend different schools of thought together for the image? Is there any documented way it’s supposed to work? Like [main subject], [art style], etc? Or is it something you picked up from trial and error?
Here[0] is a very good presentation about it.
But overall, you have to think about the context Dalle has seen similar images in the training set. If it's seen them on an art sharing site, then it's probably good to mention such sites and tags it could hypothetically have there. Or if it's more like a photo in an article, think about what could be written about it in the article.
That's my intuition about it at least.
[0]: http://dallery.gallery/wp-content/uploads/2022/07/The-DALL%C...
You definitely glossed over it - the Dall-E 2 homepage has the three main features of the program with examples (Image Generation, Edits to Existing Images, and Variations of the image)
https://openai.com/dall-e-2/
Fine-tuning image generators with custom datasets has been done. As usual, data is a big factor in the generated results.
latent diffusion: https://replicate.com/laion-ai/erlich
vqgan + clip: https://replicate.com/ml6/julius
https://www.ml6.eu/knowhow/can-ai-generate-truly-original-lo...
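As a hedged sketch, those hosted models can be called through Replicate's public REST API; the version hash below is a placeholder you'd copy from the model's page:

```python
import os
import requests

response = requests.post(
    "https://api.replicate.com/v1/predictions",
    headers={"Authorization": f"Token {os.environ['REPLICATE_API_TOKEN']}"},
    json={
        # Placeholder: the actual version hash is listed on the model's page.
        "version": "MODEL_VERSION_HASH",
        "input": {"prompt": "simple logo of an octopus juggling databases"},
    },
)
prediction = response.json()
print(prediction["id"], prediction["status"])  # poll until it completes
```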
Very cool -- Looks like we started blogs at the same time using the same stack, down to the theme and even topic in part!
How do you have the multiple figures arranged in a div in markdown -- Is that using tables?
I also didn't want to be tied to a CLI so I write all my markdown files in PCloud (because Google Drive is an ass) and have a webhook button on my phone that grabs them all and deploys.
Also been very happy with https://typora.io/ which has pasted image settings to move them to a folder in the file's directory.
Thanks, haha!
My blog source code is hosted on GitHub[0] and deployed to GitHub Pages. Everything is done automatically by GitHub Actions.
You can see the image arrangement code in the source of the article - it's just raw HTML with inline CSS. A very ugly approach, but it works.
For the images, I just changed the download directory of my browser for the time I was writing the article so that it put the images into the right folder automatically.
Good luck with the article!
[0]: https://github.com/cube2222/cube2222.github.io
[1]: https://github.com/cube2222/cube2222.github.io/blob/main/con...
Not having a vector logo limits the places in which the same logo can be used.
Creating the vector paths is not the most difficult aspect of creating a logo. Designing it is.
Incidentally, you can ask DALL-E 2 for "vector art" and it'll comply, with good enough separation that it can be traced with something like Inkscape into true vectors.
You can also ask for "black and white vector art" to limit the color palette.
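If you'd rather script the tracing step than click through Inkscape, here's a small sketch using potrace (a common open-source bitmap tracer) instead; it assumes potrace is installed and the DALL-E output is saved as logo.png:

```python
import subprocess
from PIL import Image

# potrace wants a 1-bit bitmap, so threshold the generated PNG first.
Image.open("logo.png").convert("1").save("logo.bmp")

# Trace the bitmap into an SVG with true vector paths.
subprocess.run(["potrace", "logo.bmp", "--svg", "-o", "logo.svg"], check=True)
```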
One can pay someone on Fiverr 5 USD to vectorize it.
It's not as simple as "take a bitmap image and make it a vector". Yes sure, they'll vectorize it, but it'll look bad, through no fault of theirs. When creating a good looking vector image, you generally need to take into account it being a vector from the beginning of the design process.
I would be very worried if I were going to want a job as a graphic designer.
It may be a better idea to learn how to prompt the future AIs.
Before we know it, we will have an AI making an entire movie (about 200-400k frames)
Like many I am still waiting for my access to be granted to it. I can understand them slowly opening it up more due to apparent resource constraints.
This does make me wonder if it will be feasible in the future to run these kinds of solutions on your own PC, even with pretrained models. Or will these AI solutions generally trend towards being hosted in the "cloud", as consumer PCs will never catch up with the resources they require?
This simply feels like how everyone iterates on a "google search" for something.
You search the way you think you should at first, and don't get what you want.
But that search informs you of the "terms of the domain" in which you're searching.
So you then refine your search to include those terms, and iterate until you find what you were looking for in the first place but weren't an expert (SME) in.
The more I see from this, the more I think it's not unreasonable to have AI write code. "Create a user signup and authentication form with RoR that uses 2FA based on phone number", and it gives you several to choose from. You pick one then start refining the requests.
Also, perhaps it can be smart enough to ask questions, like "What database should this be written for?" in the above example.
Copilot is halfway there
I wish that Dalle produced a single image for a prompt. I often need at least 3-4 iterations of a prompt to get what I'm looking for, and I wish it didn't make 4+ images per prompt. Too expensive.
> To be completely honest, I would prefer something slightly simpler with less complex shapes, but I failed to persuade Dall-e into generating that for me. Moreover, I really am content with this logo.
well, that's pragmatic! I think they should go back into their image editor and simplify it themselves though
Can't sign up for the waiting list (https://labs.openai.com/waitlist): "Our services aren't available right now. We're working to restore all services as soon as possible. Please check back soon."
I am trying to like DALL-E, but trying to create any kind of interesting album art seems to violate the terms of service.
I guess it's useful for family-friendly, Disney-esque corporate media, but for what I would consider real, impactful art, it is lacking a great deal.
Don’t worry artist, your jobs are safe.
Looks good for ideation. Could potentially be more useful for an agency or creative professional building logos. They can make vector art off of promising mock-ups. Also most of the generated images need to be simplified for sake of a logo, but a professional can do this better.
There are several images in the article that I think would make great t-shirts for developers!
I'm not sure I'd use this for a logo.
IANAL but my understanding is the ability to copyright the output from something like DALL-E 2 is questionable at best, due the lack of human authorship.
(See "Monkey selfie copyright dispute" on Wikipedia for more info.)
What if DALL-E or its successor spits out a few hundred trillion illustrations? None are copyrightable?
I think someone taking the time to touch this up would make it copyrightable and trademarkable, and I'm okay with that.
AI generated patents though? Only if we allow AI generated prior art.
Entering these same search strings into Craiyon, it's remarkable how much better DALL-E is.
It's interesting how every time I come across an AI-related article, it's always "imagine how much it will improve in 2 years".
This may or may not be the case, but I get the feeling that most IT folks haven't heard of diminishing returns.
This automates 50% of a modern tech company; now you just need to automate the code generation, which already seems good enough to be on par with modern tech companies. It seems like a manager type could run his entire tech company himself now.
The Art Lebedev design studio was the first (citation needed) to sell ML logo creation as a service: https://ironov.artlebedev.com/
And it seems LAION was the “first” to offer it up for free.
https://replicate.com/laion-ai/erlich
It is not even close to what Ironov does. More like a tech demo. Ironov outputs a complete brand book, and its interface is set up for exploration and logo refinement.
Fair enough, sounds expensive.
Next step would be to get Dall-E to generate web designs based on a few preferences. “Give name a Scandinavian web design that looks like IKEA’s web site but with blue and black as the primary colours.”
"thanos face, logo, looking surprised to the front with flames in the background, circled"
https://imgur.com/a/1IiyMJF
Inspired by this I spent all my free credits trying to generate a QSL card for ham radio. I got pretty close but I think I have to accept I'm just not that good at making art, even with a great AI :)
> the fact that adding “artstation” to the end of your phrase automatically makes the output much better…
It's like saying "steal from artists, but only the good ones"
I've also made this logo with DALL-E: https://www.aidemos.info/about-us/
Very impressive. What amazes me is how closely the images match the prescription given. How long did a typical iteration take? How long did the total process take?
Glad you liked it!
Not sure about "iteration". I mostly did a lot of experimenting.
If you have something like "let's add a helmet to this", then that's basically 5 minutes to a good result.
The whole process took a few hours, if I remember correctly, with the main hurdle being coming up with the phrase for sensibly laid-out pictures (the circle background). It went quite quickly from then on.
Even more impressive, then, because generating that many sketches by hand, or with software but without this prescriptive generator, would take many days or even weeks.
A couple of years ago there was a list of jobs in danger of being automated within 10 years. I don't recall if designer was on the list, but it looks as though that moment has gotten significantly closer.
Mods: I see the title got the purpose of the logo edited out, but I think at least adding "a logo for my Open Source project" would be a much better title.
Octopus-related data tools:
https://imgur.com/a/k7zwA1R
I loved the bottom left from the ones with the diagram so much more... it's simple and nice at the same time.
This is much better than I thought. Nice!
Looks like art majors should have listened to their parents. This really is the beginning of the end.
If you haven't seen it, watch the current season of Westworld.
Assistive technologies like this are awesome!
I use DALL-E for the same reason. Sometimes to pass the time I use it to make Slack emojis.
Perhaps art is the next profession relegated to unemployment because of computerization.
Honestly, I feel this tool will allow bad designers such as myself to create bad designs.
This is insanely cool. Does anyone have invites so I can try it out myself?
For logos, even Craiyon (formerly Dall-E Mini) does a pretty decent job
so, I'm not trying to be a pedant here… This is a very cool exercise in using DALL-E 2 to generate an icon.
It would be extremely cool to see the same process spent on generating a logo.
From the Greek word októpus. So, octopuses, not octopi.
absolutely fascinating. great write-up!
Blue book logo
Image generators are cool, but there's no shortage of them (Midjourney & running your own on Colab), and DALL-E 2 has nonsensical bans (why does "pepe" go against the content policy?).
OpenAI has nonsensical censorship. DALL-E might be popular right now, but OpenAI won't survive if they keep up their ridiculous attempts at controlling culture. I've already got something running on Colab, new models are coming out, and Midjourney just got a v3 update that blows DALL-E 2 out of the water.
https://en.wikipedia.org/wiki/Pepe_the_Frog may explain the ban.
lol, pepe did nothing wrong
Just because some asshole uses the peace symbol, does it mean the peace symbol is hateful? Anyone who claims so is dishonest
If a neural network trained on data sucked off of the internet constantly draws frogs with a halo of swastikas when you ask for “Pepe” then maybe there’s a problem that needs solving there.
feelsbadman
I was disappointed to see this on their site. I had a pretty good idea of doing a sort-of online art installation by grabbing crime data near me from local web sources and having images automatically generate of those crimes, but unfortunately many of those crimes seem to be too violent for their filter.
I'll have to find a different option.
Midjourney is censoring images as well...
I can't get anything good from DALLE-2. It seems so fcking stupid. Whatever I try, it gives me total BS, sometimes it just refuses to generate anything complaining about ToS violation.
With DALLE 2, I’ll pretty much never hire a graphic designer again to make any kind of logo. Good riddance, I’m sick and tired of their pretentious justifications for charging upwards of $500-$1k or more for simple logo designs.
I'm not looking forward to people saying the same of coders in 10 years time...
Trivializing the work of people outside of one's profession while giving more importance to one's own is as old as the human civilization. There are plenty of people cursing software developers right now for pulling six digit salaries in exchange for typing some dumb text on a screen.
Ah man, I do like what DALL-E is doing right now. I'm curious about the possibilities, even thinking of using it as a start for digital art and then manipulating the output in Photoshop. I'm fairly good/creative at that manipulation side and I like what I can do there, but I am a terrible artist, so this is good for getting a starting point.
However, as it gets better, even that won't be needed, and I'm concerned about what that means for the average person, or for me trying to get my skills up so I can increase my income, only for it to get wiped out by AI at some point in the future.
That's already happening with no-code tools, hosted solutions, SaaS, GitHub Copilot, and more.
So far no-code and copilot have put exactly 0 developers out of business.
You obviously don't know what Github copilot is if you think it has put anyone out of a job.
By then I will have moved far beyond hands on keyboard type work and will be managing teams.
Doubt it, you seem kind of socially inept.
I speak under correction here, but you won't have teams; you will just do it yourself. Like CAD/CAM software got rid of a LOT of machine drafters, and the designers just became responsible for outputting the finished drawings. You'll be the entire team.
Teams of what?
I beg to disagree.
I think it makes much more sense for simple illustrations for articles, presentations and books ("pencil sketch" style). For logos, especially since you'd usually want simpler shapes, less detail, with a lot of readability, I'd go pay an artist if it was for a company I was building.
Heh, and you don't think we'll train AIs specifically for drawing logos, where you can specify not only the features you want but even the demographics you want it to appeal to, based on mass collection of data?
I don't imagine the image quality will stay at this level for long. It'll likely improve dramatically over the coming months/years.
For logos you want specifically a design that will work well in black and white, and you want assets that are vector art. At the point that AIs can produce that it's worth revisiting for logos, but I'd bet that's probably more on a "many years from now" schedule.
That sounds a lot like a way to score the outputs for training to me.
These AIs weren’t trained on logos.