This week I continued my work I started last week with the AtlasNet generative autoencoder network. While I have not yet achieved satisfactory results, I did make some important understandings this week that helped me understand the network better.
Scaling training up to all meshes
Last week I ran for 120 epochs of training using a limited selection of 3 meshes from the 30 sword dataset I created before. This week I started off by training on all 30 meshes. I had to normalize each of the meshes as before, but once this was done training took only 8-10 minutes. However, looking at the results I was surprised to find that almost all sword models were reconstructed like the one I tested last week, with only some slight variations. This seemed to disregard features such as large curves in the blade that I would have thought would affect the mesh shape.
Next, I tried training on a single mesh repeated, thinking I could attempt to force overfitting to at least see if the network would learn to reconstruct that one mesh. However even in this case the reconstructed mesh looked the same as before. Confused, I decided to try training on some of the original data instead, choosing the plane category.
I trained on 40 different planes and was shocked to find that the network output a (comparatively) near-perfect reconstruction at the end. This was even with fairly significant intra-category variance in mesh shapes, such as spy planes, biplanes, and jetplanes. Continuing this investigation, I tried a similar tactic as before and trained with just a single biplane mesh duplicated, and got similarly impressive results.
At this point, I began suspecting that somehow a pre-trained version of the network was being used, or at least aspects of it, and that this was causing it to perform unreasonably well on sample meshes. I tried removing files from the
trained_models folder, and after some tries eventually realized I was right. My mistake was that every time I trained the network, a new file was posted in the
log folder, but I was using the same pre-trained network every single time.
This put all of the previous results into perspective, such as why all swords were coming out nearly the same (the network was never trained on swords) and why the planes were all reconstructed very well (the network was trained on planes!)
Now that I understood what was wrong, I was able to switch to using the trained networks and test those results. Unfortunately, those were not necessarily great. Essentially the way that AtlasNet works is by sampling several points in 3D space as the locations of “primitives” in an imaginary 2D atlas (like a texture atlas).
A certain amount of points are assigned to each primitive, and these points gradually learn to spread out over time to match the training point cloud and decrease the model loss.
However, in my experiments these points did not seem to spread out nearly fast enough, and at the end I was left with several disjointed planes. I attempted nearly doubling the number of epochs as well as quadrupling the number of learnable primitives, but none of these steps led to the model learning any better representations.
Even though I did not make groundbreaking progress this week, though my mistake and realization I came to understand how AtlasNet works and that, if I can train it right, could give me very impressive results as seen with the planes. On the GitHub page the paper author recommends to another person that they have at least 1,000 examples if they are training their own data; I had only 30 at most. Thus at the start of this next week I plan to take time to expand the scope of my dataset (which will take a good deal of time). However, I think before this I should try training and reconstructing from a dataset of around 1,000 of the planes or other sample data to make sure that this would actually work before I put in all that effort. If the results are good, this may be what will let me unlock the true potential of the AtlasNet network.