PDA

View Full Version : Updates to The NIN Hotline (the platform)



Leviathant
11-30-2023, 09:02 PM
If you've been following what I've been posting on the website and on social media, you know I've kicked up a big fuss about really restarting the website and decoupling from social media. It's something I really focused on back in June & July, mostly with bugfixes, and then by the end of August, I launched the new "cover art" homepage, wherein I've been updating new backgrounds and new 'album covers' for the website whenever we have sizable news to post.

I figure I should write down some of what I've been doing under the hood as well, because I'm feeling real good about finally executing on some ideas that have been rattling around in my head for at least a decade.

Lately, my focus has been on reactivating the news archives (https://www.theninhotline.com/news/archives/). I had a lazy script that auto-generated links to archived posts, but it was just cycling through months and years, regardless of whether there was content. This worked when the site was busy, and fell apart when the site ran quiet for months at a time. Part of why it ran quiet was because the admin, written in Perl, kept tripping weird security issues on the new hosting environment, and I just did not have the time to try and debug that. Not only that, but the kind of help I would need to fix that would be from the kind of people who were probably even less interested in working on an old codebase out of the goodness of their heart.

I fixed most of those problems in the last year or two, manually, and for the most part, the admin is up and running in enough capacity that I can not only post news to the front page, but I can also post articles into the article archive. Feels good! It would be nice if there was more to post about, but I'm okay with the lull.

Before deciding to take on new features, I finally took a shot at this article archive. As the link above shows, it's now a collection of graphical representations of each year that news was posted. Someone suggested that on Twitter, years and years ago now, and I liked the idea, but I didn't want to work it out, and I didn't want to hire someone to do something that probably wouldn't quite be what I wanted.

ChatGPT to the rescue!


I have a MySQL table of news stories, spanning decades. Using PHP, HTML and CSS, I'm going to use the data from this table to display a series of bar charts, each representing a year of articles. Each bar chart will represent one year, and will contain twelve bars, each representing the number of stories associated with each month of that year. Each bar chart should be labeled with the year that it represents. The html and CSS should be mobile responsive. On desktop browsers, each bar chart should be approximately one sixth of the width of the page. On mobile browsers, they should take up about one third of the overall width of the page. The height of the individual bar graphs should each be no larger than 25% of the overall height of the page. The table is named news_archives, and the column containing the relevant date information is titled news_date

My god, if it didn't spit out HTML, PHP, and CSS that got me 75% of the way there. It was outputting weird, so I just wrote:


The above code isn't outputting the bar graphs correctly. It does not appear to group the months by year, when outputting.

ChatGPT apologized and gave me new code, and this was now 95% of what I was looking for.

Leviathant
11-30-2023, 09:24 PM
I get to work on reformatting the style sheets so that it doesn't look quite so janky, and then I start looking at the really old archives. I'm talking about the raw HTML files that were output by the Guestbook script I had hacked into a pseudo-content-management-system prior to the news script that it still runs on today, like this one (https://www.theninhotline.com/news/archives/2000.08.html). As much as I appreciated the "preserved in amber" aspect of them, it was time to pull that information into the core database.

This is something I largely avoided because it's laborious to write something that accurately parses and extracts information. Over the (ugh) decades, better tools have come out to help with this, so it's not as bad as rolling your own regular expressions. But it's still a lot of work.

Except, once again, for ChatGPT.


Data for my CMS is stored in a flat file representing several columns delimited by |:| followed by a newline character at the end of the line.
The columns are Date, author, email, time in 24 hour format, headline, a unix timestamp based on the date and time, and then the content of the post.

Here are two example lines:

<<blah blah blah data data data>>>

I have an HTML file. I need a Python script to read an HTML file, passed in via the command line, and then parse the content of the HTML file. The HTML file contains a series of tables. If the content of a table cell is a series of three numbers separated by a period - for example, 1.29.01 or 11.12.02, that should be interpreted as a date of a post.
A single line of text in Bold tags should be considered the headline of a post.
If, in between font tags, there is text like [00:01] or [19:27] this represents the time of the post.
Something that says "updated by" followed by a link is indicating the author of the post.
Any TD tag with a width of 560 contains the content of the post.
The date and time can be used to generate a Unix timestamp.

Based on this information, write a python script that parses the HTML file, extracts that data, and then outputs it into the previously described flatfile format.

Reader, as you might expect with ChatGPT, it didn't function 100% on the first go


Update the script to detect the file encoding before parsing

...so I decided to take a closer look


Add some debug code to display the status as the script is being run

It took a few more prompts, and then I got hands on with the code, and before I knew it... it was working. It would suck in an HTML file, output the weird custom flatfile format my CMS uses, and I was able to append it to the core database. I did this for all of the earlier archived months, post-Y2K, but haven't touched the 1999 stuff yet. Ironically, the code I'm working with is based on two-digit years, despite literally going to college to learn how to fix that in old COBOL programs. At the moment, I've gone far enough back that I don't want to mess around too much with loading the first six months of the site into the database. I'll get there though.

It was really cool to see how this translated into the bar graph on the archive index. The year The Fragile came out, we were cranking out news at a pace we wouldn't hit again until the release of With Teeth.

At the moment, the display of an archived month looks the same no matter what year you're in. I'm planning on revamping some of the older pages so that they reflect the design of the site in their respective years, at least to some extent.

Another thing I was able to do during this process was recover some of the September 2000 news. Regrettably, more than half of that archive got cut off during a manual update to the HTML file a very long time ago. We had moved beyond copying-and-pasting HTML code by now - I think - but were still doing manual fixes to the news page, and someone must have accidentally deleted everything before 9/19... and later the whole month of September disappeared from my archives. The Wayback Machine at web.archive.org had a cached copy of the website that included some entries from that month. I was able to pull that down, parse it with my ChatGPT-originated Python script, and now those entries are restored to the site.

I also took the opportunity to update the URLs. You used to hit links that looked like this: https://www.theninhotline.com/news/archives/backissue.php?y=04&m=7 (This still works, and I should probably set up a redirect)

Now, https://www.theninhotline.com/news/archives/year/2004/month/7 gets you to the same thing. Maybe that'll help with search engines? I don't really know anymore, but it's just a little easier on the eyes.

Leviathant
11-30-2023, 10:00 PM
It was during this explosive burst of productivity that I decided that in addition to resurrecting my old news archives, I should re-resurrect the news.nin.net archives. This was the original NIN news website, it looked like a blog two years before the word blog was coined, and is an obvious ancestor to what I ended up building. It also stopped working at least a decade and a half ago, maybe closer to two decades ago. Mostly that meant that the news page just didn't load... but one day, I went to the site, and all the internal folders were open to browse, and I found the raw datafiles that were used to generate that stuff. I downloaded them, but didn't really have much else to go with them. By 2005, Greg (who founded ETS) created an application (in C# maybe?) that parsed those files and output them in nice, readable pages that echoed the design of the original news.nin.net site - here's a link to web.archive.org (http://web.archive.org/web/20051226163612/http://www.echoingthesound.org/archives/monthly.php?ms=121997). Greg had improved on the original formula, in that you could browse by month, and there was a nav and a table of contents on the side of the page. He hosted it on the same server as ETS.

Four years later, the server hosting Echoing the Sound shit the bed, and with it went the news.nin.net archive. Greg had given me the code that he'd used to do the hard work, but my focus at the time was to spin up a new server and get ETS back online.

So, back to yesterday. Just for shits, I visited news.nin.net (http://news.nin.net/) and to my astonishment - it worked. As best as I can tell from the Wayback Machine, it went back online sometime around March 2021. I guess during COVID lockdown, Jason Patterson took time to revisit his old shit, and managed to fix what was broken, and no one told me.

Fun fact - for at least a decade, I'd email that guy every couple of years asking if he'd relinquish nin.net to me. I had a vision: wiki.nin.net, news.nin.net, catalog.nin.net, live.nin.net - bringing all these incredible fan-made sites together under one domain. Dude never wrote back. It frustrated me that he was effectively squatting on the domain, but I also totally understand how this was an important chapter in his life. Looks like he works at Instagram now, working on their mobile app - appropriate, as he was SUPER early on building mobile applications back in 1998, for Windows Pocket PCs and shit like that.

Anyway... saved me a bunch of work. What's amazing to me is that the banner ad at the top of the page still kinda works. I'm sure the original ad network has been bought and sold a few times, but whoever's been in charge has made sure that none of the ad services bombed out, all these years later.

TheBang
12-01-2023, 03:00 PM
Yeah, ChatGPT was super-useful in helping me fix hundreds of date string formatting issues in the ninlive.com RSS feed. It probably would have taken me several hours to muddle through the code myself, but ChatGPT helped rapid prototype some code in minutes. After some testing and further refinement, the job only took about 30 minutes to complete. When I went to run the code and discovered I didn't have a PHP interpreter on my machine, I also had ChatGPT convert the code to Python.

But, you do have to be careful, because it can be really dumb. I noticed that it had created an extraneous subroutine in the code that did absolutely nothing. Although it didn't affect the output, it took three prompts to get ChatGPT to properly excise that code. One time, it even apologized for the error and then gave me "corrected" code that was exactly the same as before. I also noticed that it was constructing the formatted date string manually in the code. I had to ask it to rewrite the code using the Python-native date-handling functions.

Leviathant
12-01-2023, 03:32 PM
But, you do have to be careful, because it can be really dumb. I noticed that it had created an extraneous subroutine in the code that did absolutely nothing. Although it didn't affect the output, it took three prompts to get ChatGPT to properly excise that code. One time, it even apologized for the error and then gave me "corrected" code that was exactly the same as before. I also noticed that it was constructing the formatted date string manually in the code. I had to ask it to rewrite the code using the Python-native date-handling functions.

Yeah, there were definitely a few times where I realized I was talking to a wall. And as good as it is with creating regular expressions, it's still not as good as I am. That's okay though, I've been tech lead on a small team of junior developers, for multiple projects. Same thing, but the consequences of bad code making it into 'production' are way less scary, haha.

Erneuert
12-05-2023, 05:46 AM
Great work, Matt.

I’m just so worried where this ChatGPT is going to take us, long term.