Microsoft adds Python to Excel

Several years late, it seems M$ is considering making Python native to Excel.

I wrote a … you guessed it … Quora answer as to why this makes sense for Microsoft.

---

Python has become massively popular with the data and machine learning communities in the last few years.

Tools like Jupyter are increasingly popular and serious interfaces for data-modellers who previously would have used Excel.

It absolutely makes sense for Microsoft to try to embed Excel, and itself, into that emerging Python data ecosystem by making Python a first-class citizen (i.e. default, guaranteed to be there) of Excel.

Not just a third-party add-on for those who know about it and can make the effort to install it.

Not only does Python need to be standard within Excel, but access to pip and all the Python libraries needs to be there too. So that Excel becomes the equivalent of Anaconda.

That’s the way that M$ can keep Excel relevant in the new data age.

This is not only a good idea for Microsoft. It’s the difference between Excel remaining a major player in data modelling and analysis tools, vs. declining into obscurity.

Gmail Snooze

Is Gmail Snooze basically building Mind Traffic Control into your email client?

Well, it’s about time Google did something radical to improve Gmail. There’s still so much untapped potential in the mailbox. And at least email is an open protocol, one that we should defend against a move to walled gardens like Facebook, Slack etc.

Working with the file system is too verbose. Let’s make it more like jQuery!

Working with the file system in Python is too verbose.

Seriously.

Every time I want to do something with files I need to remember whether it’s the `os` or `sys` library that I need to import. I need to remember or look up half a dozen other functions to grab particular bits of metadata from those files.

I’ve written countless nested loops or recursive functions to walk the tree of files over the years.
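For a sense of the verbosity in question, here is a minimal standard-library sketch of one such walk: collect every .js file under a root while skipping any directory named "vendor". The function name `find_js_files` is made up for illustration.

```python
import os
import re


def find_js_files(root):
    """Return absolute paths of .js files under root, skipping 'vendor' directories."""
    pattern = re.compile(r"\.js$")
    results = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk does not descend into vendor directories.
        dirnames[:] = [d for d in dirnames if d != "vendor"]
        for name in filenames:
            if pattern.search(name):
                results.append(os.path.abspath(os.path.join(dirpath, name)))
    return sorted(results)
```

It works, but the pruning trick, the path joining and the regex all have to be remembered or looked up each time.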

And often, when I want to do something quick to a bunch of files, I start writing a shell script. Then I realize I hate, and can’t remember, shell script and think it would be so much easier in Python. Then I open the editor to try to write my code in Python, and I realize it’s too much trouble and go back to mashing around in bash again.

And then I remembered jQuery, with its refreshingly easy abstractions for maneuvering in, and manipulating, a tagged tree-shaped data-structure. And the file-system is just a tree, right? So why does it have to be hard? Why shouldn’t working with files just be like working with jQuery?

This is such an obvious idea, I’m sure someone must have done it.

But I couldn’t find it. So I had to write it myself a couple of months ago.

So, let me present … FSQuery. Now available on PyPI and GitHub.

Here’s what it looks like :

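from fsquery import FSQuery

path = "."  # placeholder: root directory of the files you're interested in

# Chain filters: names matching .js$, skip "vendor" directories, files only.
fsq = FSQuery(path).Match(".js$").NoFollow("vendor").FileOnly()
for n in fsq:
    print(n.abs)  # n is an FSNode; abs is its absolute path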

You create an FSQuery object with a path to the root of the files you’re interested in. You then load up extra filters / queries on the query by chaining them together.

Finally you can treat the whole thing as an iterable collection and loop through its results.

It returns objects of class FSNode, which represent nodes in the file-system, either files or directories. The modifier FileOnly() restricts the query to only return files. You only need to add this once to the query.

The NoFollow() method tells the query to avoid directories that match the name, but it has no effect on file names. In the above example, “vendor.txt” would still be included in the results if that file is anywhere other than under the vendor directory. (This beats just piping find through grep in the terminal.) You can add as many NoFollow filters as you like to a query.

On the other hand, Match() is an inclusive filter. Only files whose names are explicitly matched end up in the results. However, this filter isn’t applied to directories. FSQuery will still explore and return directories whether they match this or not. You will usually want to combine a Match with a FileOnly to get the effect you want (e.g. in this case, all the .js files anywhere except under the vendor directory).
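To make the chaining and the filter semantics concrete, here is a simplified sketch of how such a query object could be put together. It is not FSQuery's actual implementation: `MiniQuery` is a hypothetical stand-in, and treating the NoFollow argument as a regular expression is an assumption.

```python
import os
import re


class MiniQuery:
    """Hypothetical, simplified stand-in for a chainable file-system query."""

    def __init__(self, root):
        self.root = root
        self.match_patterns = []  # regexes a file name must match
        self.no_follow = []       # regexes for directory names to skip
        self.files_only = False

    def Match(self, pattern):
        self.match_patterns.append(re.compile(pattern))
        return self  # returning self is what makes chaining possible

    def NoFollow(self, pattern):
        self.no_follow.append(re.compile(pattern))
        return self

    def FileOnly(self):
        self.files_only = True
        return self

    def __iter__(self):
        for dirpath, dirnames, filenames in os.walk(self.root):
            # NoFollow prunes directories only; file names are never tested against it.
            dirnames[:] = [d for d in dirnames
                           if not any(p.search(d) for p in self.no_follow)]
            if not self.files_only:
                for d in dirnames:
                    yield os.path.join(dirpath, d)
            for name in filenames:
                # Match is applied to files only; every pattern must match.
                if all(p.search(name) for p in self.match_patterns):
                    yield os.path.join(dirpath, name)
```

With this sketch, `MiniQuery(root).Match(r"\.js$").NoFollow("vendor").FileOnly()` yields a top-level "vendor.js" but nothing under a "vendor" directory. Without FileOnly(), directories come back too, whether or not they match.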

We can even look inside files with the Contains filter, eg.

fsq = FSQuery(path).NoFollow("vendor").Ext("py").Contains("GNU Lesser General Public License").FileOnly()

Note that this is implemented purely in Python in a very inefficient way. (I.e. I just open up each file and run through it looking for the string.) It can be slow with large chunks of the file system.
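The "open each file and run through it" approach amounts to something like the following sketch. The helper name `file_contains` is hypothetical, and the line-by-line scan is an assumption about the general idea, not a copy of the library's code.

```python
def file_contains(path, needle):
    """Return True if any line of the text file at path contains needle."""
    try:
        # errors="ignore" lets the scan get through files that are not clean text.
        with open(path, "r", errors="ignore") as handle:
            return any(needle in line for line in handle)
    except OSError:
        # Unreadable or missing files simply count as "no match".
        return False
```

Every candidate file is read from disk until a match is found, which is why a query like this gets slow over a large tree. A line-by-line scan also misses a string that is split across two lines.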

Note also the Ext() filter for file-name extensions. This is easier than regexing the whole file-name if you’re just looking for files of type “py”. Be aware that if you try to have two Ext() filters on the same query, you will get no files returned. No file can have two different extensions at the same time.
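The reason two Ext() filters cancel each other out is that filters combine with AND. A tiny standard-library illustration follows; the helper names are hypothetical and this only models the logic, not the library.

```python
import os


def has_ext(name, ext):
    """True if the file name ends in the given extension (without the dot)."""
    return os.path.splitext(name)[1] == "." + ext


def passes_all(name, extensions):
    """A name passes only if it satisfies every extension filter at once."""
    return all(has_ext(name, ext) for ext in extensions)


names = ["a.py", "b.js", "c.txt"]
one_filter = [n for n in names if passes_all(n, ["py"])]         # ["a.py"]
two_filters = [n for n in names if passes_all(n, ["py", "js"])]  # []
```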

More documentation, and more advanced tricks can be seen on the GitHub site. This is also the first library I’ve put on PyPI. So installing in your own project is as simple as

pip install fsquery

A Replacement for PHP

I asked Quora for ideas for how a replacement PHP might look.

On the whole, people are not enthusiastic. Alexander Tchitchigin had an interesting answer, but one which focused on the basic theme of “once we move away from PHP’s weaknesses, we might as well use any language”.

Which prompted me to write this comment elaborating what I was interested in. Plus some ideas of my own.

I agree that there are downsides to “embedded in HTML”.

But I think we can also see a number of what I call “pendulums” in computer science. For example between centralization and decentralization. Centralization gives you economies of scale, eliminates redundancy and makes it easier to see the wood for the trees. Decentralization makes it easier to modularize (or divide and conquer), easier to test and find bugs, easier to scale, easier to improve individual modules etc.

The pendulum oscillates because whenever one of these principles becomes more dominant, everyone starts to feel the pain and see the attraction of the other pole. And then stories start proliferating of the virtues of shifting the other way. Once everyone does, of course, there’s a pull back to the first way again. And so the pendulum continues to swing.

Right now I’m seeing this centralized / decentralized tension in ClojureScript web-frameworks. Comparing using Reagent directly with devcards vs. using Re-frame. Trying to decide whether the convenience of the modularity of keeping state decentralized in individual reagent components and being able to use devcards, outweighs the extra transparency of centralizing state in the re-frame db.

Now I think this thing about “MVC vs. templates with code” is another pendulum rather than an absolute principle.

At the end of the day, HTML is the data-structure for the application’s GUI. And the GUI data-structure does need to be fairly tightly integrated with the code. It doesn’t make sense to try to decouple them too much. You need a button attached to a handler attached to some transformation in your business logic. There’s no point trying to keep these things apart. A button without functionality makes no sense. Nor does functionality that can’t be accessed through the UI.

I’m now using Hiccup to generate HTML. And the place to do it is obviously tightly integrated with the actual functionality of the app in the code itself.

Yes, there are still some MVCish intuitions at work. But I don’t need or want a language to try to hard-enforce that separation between UI and functionality, either in separate files or with separate languages, when a simple DSL in the main programming language is sufficient.

But when you look at a Reagent (i.e. React) component, it’s basically a Hiccup template with some extra code in it.

Now the brilliance of PHP, the reason it’s so popular is that :

a) it’s just there, pretty much always.

b) it simplifies simple sites considerably by automatically mapping the routing onto the directory structure of the file-system. There are many cases when that is fine. Why should I have to hand code a whole layer of routing inside my code when the file system already provides me with a logical hierarchical layout?

I think there’s still value in the “map files to pages” part of PHP. And that’s what I’d expect a “new PHP” to keep. Along with the “available everywhere” bit.
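As a rough illustration of the “map files to pages” idea, here is a minimal Python sketch of directory-based routing. The function name `resolve_route` and the choice of "index.php" as the default file for a directory are assumptions made for the example. This is not how PHP or any particular web server implements it.

```python
import os


def resolve_route(site_root, url_path):
    """Map a URL path to a file under site_root, or return None if there is none."""
    root = os.path.realpath(site_root)
    candidate = os.path.realpath(os.path.join(root, url_path.lstrip("/")))
    # Refuse anything that escapes the site root (e.g. "/../secret").
    if os.path.commonpath([root, candidate]) != root:
        return None
    # A directory is served by its index file (assumed here to be index.php).
    if os.path.isdir(candidate):
        candidate = os.path.join(candidate, "index.php")
    return candidate if os.path.isfile(candidate) else None
```

There is no routing table to maintain: adding a page means adding a file, and the directory layout is the site map.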

Of course, how it might look, might be more like Hiccup, a light-weight DSL rather than the verbosity of HTML. With each file implicitly mapped to a React component. Perhaps something like I started describing here : Phil Jones’ answer to What’s the best programming language for applications and GUIs?

In that other answer, I talk about what I found interesting in Eve : the event-handling within the language through “when” clauses and an implicit underlying data-structure.

And I tried to sketch what a language that brought events to Hiccup might look like :

 <pre>
    (defcomponent click-counter
        {:localstate ['counter 0]
         :when (on-click "the-button") #(reset! counter (+ @counter 1))
         :view
          [:div
             [:p "Counter clicked " @counter " times"]
             [:button {:id "the-button"} "Click Me"]
          ]
        }
    )
</pre>

Many frameworks encourage you to put components etc. in different files and directories anyway. Why not make this “official”, in the same way that Python makes indentation official, and use the directory structure to infer the program structure?

A modern language with the easy-accessibility defaults of PHP would be a powerful combination.

August is Patterning Month

August is Patterning month again.

I’m back to work on the Patterning library. And, in particular, getting it working properly in the ClojureScript, in-browser version. I’m going to be using devcards, figwheel, spec and other good tools in the Clojure community.

I’ll be revamping the site and releasing new versions of the code.

Watch this space …