Let’s have a talk about pulling in packages.
When is it necessary?
You want to check if a number is even, in a language that isn’t type-safe. Your first intuition is to use the %
operator, but it doesn’t work. You’re frustrated. Seven beers and 20 syntax errors later, you throw away any hesitation about bloat and decide to just add another dependency. What does it matter?
There are quite literally, millions of tiny packages on npm
. is-even
is just one of them. A full package where the metadata, even the README
is longer than the code itself.
Here are the contents of is-even
in their entirety (credits to jonschlinkert
):
var isOdd = require('is-odd');
module.exports = function isEven(i) {
return !isOdd(i);
};
When you pull in is-even
, you’re not just adding a single package. Dependencies beget dependencies – a package for a package makes the whole world slower. What is is-odd
? Another package, of course. is-odd
is slightly more complex, since it checks if a value actually is a number (pulling in is-number
in the process). In total, you’re pulling in three packages to express operations that are built into the language. Even though Javascript’s type system is fickle, you would think that typeof
would solve your problem pretty easily.
var isNumber = function(x) {
return (typeof x === 'number');
}
A polymorphic function that works on integers and floats. Done. “But what if it’s NaN
!????”, you might be asking:
var isNumber = function(x) {
return (typeof x === 'number' && isFinite(x));
}
Finally:
var is-even = function(x) {
return (isNumber(x) && x % 2 === 0);
}
var is-odd = function(x) {
return !(is-even(x));
}
Problem solved. This is pretty much what the original code does, but in 20 lines of code. No triplicated metadata and git directories needed. It’s easy to add directly to your projects and your users will thank you for saving them 3 packages in node-modules
. Even funnier is that there are 45 similar packages all over npm. is-odd
is a dependency of gems such as odd
, which gets the odd-numbered items from an array (wow, thanks), and discord. Looking at the reverse and dev dependencies of is-odd
shows over 159 packages tangled within this sinister tree of triviality. (credits to broofa for the visualization tool). How deep does this go?
Obviously these are trivial examples, and there are much worse examples of the dependency flood that comes with npm
modules. My issue with this is that is-even
gets over 133,286 downloads per week, according to npm
statistics. The Github repo lists over 6.4 MILLION downloads total. How much bandwidth is being wasted on this? (Obviously, not as much as having to redownload jQuery every time you visit a website that uses a different CDN). An article from AppSignal shows npm
has over 1.2 million packages in total. That’s more than PyPI, RubyGems, Maven, Packagist, Hackage, Crates.io
, and NuGet
COMBINED. Even though JS is the majority language in terms of active users on Github, it only beats out Python and Java by around 7% (22.63%, 14.75%, and 14.01%, respectively). How in the hell does the npm
ecosystem have over 4x the packages?
52236 packages on npm
relate directly to npm. 1353 packages are for manipulating and working with comments. The npm
package itself has 123 direct dependencies, making up 1185 packages. A quick recursive wc -c
on my system shows that my npm source directory’s node-modules
is wabout 14.12 megabytes, which isn’t too bad considering that some people complain about their node-modules
reaching several hundred megabytes. Ignoring meme packages that purposely include an insane amount of dependencies (I’m looking at you, bloater), it seems like the average package doesn’t have that big of a package.json
. The most widely-included libraries, like lodash
and chalk
, are smart to remain self-contained.
So where is the bloat coming from? I have a few hypotheses:
- Versioning.
According to the docs, npm2 doesn’t make any attempt to resolve conflicting package versions like Stack
or Portage
would. Conflicting versions are placed into their own subtrees in the dependency tree.
npmv3 attempts to flatten dependencies into a single directory as much as possible, only defaulting to npmv2’s tree insertion when direct conflicts occur. Sandboxing is a pretty common solution to package conflicts across package managers, but the sheer amount of micro-packages on npm
makes this worse.
I really like pnpm
for how it handles versioning. Isolation is preferable to the alternative of complete dependency hell (see Stack
for a great example of how to do package isolation), but when you’re dealing with thousands of tiny files that each take up a filesystem block at minimum, it adds up. pnpm
installs libraries at global scope, hardlinking when you install packages that depend on them. It also supports incremental diffs between library versions, even further reducing the amount of disk space needed.
- Javascript is poop.
Let’s face it, there’s a lot to be desired with respect to the JS ecosystem. Have you ever had to write Array.isArray()
as a separate condition before using a switch statement to determine the type of an argument? Because 130 packages solve this.
> typeof [1,2,3]
'object'
Have you ever wondered why there are two different equality tests? Why is overloading so inconsistent? 365 packages. (Thanks to John Demetriou for some examples.)
> '1' + '2'
'12'
> '1' - '2'
-1
> '1' + -'2'
'1-2'
Why are string primitives different from string objects? I mean, just look at this:
let t = new String('hello');
let s = 'hello';
> typeof t
'object'
> typeof s
'string'
> t instanceof Object
true
so it follows that s is an instanceof
String, right?
> s instanceof String
false
> t instanceof String
true
But you can use String.prototype
methods on string literals, since they’re automatically coerced into objects for the purposes of property lookups (source). What?
The two different data types also return different results when passed to eval
. This has tripped me up many more times than I’m proud to admit.
Given these ridiculous rules, it makes sense that people would put together tons of libraries to try and reconcile these notions into something remotely usable. It’s the same reason that self-help books and seatbelts exist – resourcefulness and invention are the only remedies for chaos.
- The userbase.
There’s no wonder JS has all these idiosyncracies, considering the sheer amount of faith and responsibility we’ve placed on it over the years. “Do one thing and do it well” is a notion of the past, when we have one language that does it all. From driving your website UI, server-side logic, mobile apps, IDE’s, all the way to rocket controls. It’s a language that has to cater to every demographic, from beginners to project leads, across way too many use-cases for any significant and consistent general improvements to stick. At least there’s an official language standard, regardless of how unstandard it actually is.
What this means, however, is that there a lot of packages being written as exercises in contributing to open-source, or with extremely domain-specfic intentions and little reusability. For example, there are 137 libraries for bioinformatics, 2306 for game dev (with 410 pertaining specifically to chess!), 2325 for sound/music, and so-on. Looking through these results however, a lot of them aren’t useful resources as much as school projects and coding exercises. JS is a language that’s common for beginners to pick up, although whether that’s a good thing is debatable. Regardless, the sheer amount of new JS developers is staggering. This leads to a lot of packages that don’t necessarily serve any purpose beyond learning how to create libraries. Whether it’s worth it to upload them for others to integrate and build upon is debatable.
Of course, practice is the only way to learn how to code. But when you’re uploading a new package, you’re putting it out there as a free piece of software for others to adapt and integrate into their own projects. I’m sorry to say this, but your todo-list app isn’t useful to others, and it probably belongs on your portfolio rather than publicsoftware distributions. If you wrote a tool to dynamically generate todo-list apps, however, there’s a lot more potential there. npm
and other repositories just aren’t meant to be replacements for your personal Github page – what if you went to a hardware store to buy a hammer, but all they had were prefab houses?
The npm
ecosystem is way too bloated for any meaningful and fair cleanup to be done at this point. I don’t think any package repositories really have detailed moderation systems in place besides dealing with malware, but please let me know if I’m wrong. There’s a similar problem in Haskell, where a lot of packages are written as Ph.D theses and subsequently abandoned. There aren’t nearly as many Haskell developers for this to reach the same extent as npm
’s example, though. There is complacency in numbers.
In short, the next time you open package.json
or stack.yaml
, consider whether or not you’ll actually need that dependency. If it’s just for convenience, it helps to stop for a moment and make sure you’re not inadvertently doubling the size of your node-modules
.
I was stewing over this problem a few weeks ago, and decided that trivial packages should at least be refactored into something less-trivial. I hereby present my Haskell
implementation of is-even
:
We start off with some standard imports. Nothing too fancy here.
{-# LANGUAGE FunctionalDependencies, MultiParamTypeClasses, LambdaCase #-}
module Data.Numbers.IsEven
( isEven,
isOdd
) where
import Control.Monad.Trans.State.Lazy
import Control.Monad
import System.Random
import Data.Profunctor
import Data.Functor.Contravariant
import Data.List (foldl')
import Data.Bits
First, a simple recursive solution. is-even(n)
is just the opposite of is-even(n - 1)
, after all.
evenRec :: (Integral a) => a -> Bool
evenRec = go True
where go b = \case
0 -> b
x -> go (pred x) (not b)
Explicit recursion is an anti-pattern.
evenFold :: (Integral a) => a -> Bool
evenFold = foldl' (const . not) False . enumFromTo 0
foldl
is also an anti-pattern. Let’s get funky.
evenState :: (Integral a) => a -> Bool
evenState n = execState (replicateM (fromIntegral n) flipper) $ True
where flipper = state $ \b -> ((), not b)
evenIt :: (Integral a) => a -> Bool
evenIt = fst . (iterate (uncurry (flip (,))) (True, False) !!) . fromIntegral
This was the dirtiest part of my implementation, and I’m really sorry to anyone reading this.
evenNaive :: (Integral a) => a -> Bool
evenNaive = even
Laziness or: how I learned to stop worrying and love the garbage collector.
evenLazy :: (Integral a) => a -> Bool
evenLazy = last . zipWith const z . enumFromTo 0
where z = True : False : z
This was my first time using profunctors, and I seriously doubt this is the actual use-case for them. Essentially, we compose from left to right, a series of subsequent applications of not
to False
.
dimap
takes two functions and a profunctor, which is contravariant in its first value and covariant in its second. dimap
is honestly redundant here as we accomplish the same thing (applying not
n times) in the next block.
data EvenP a b = EvenP {
contr :: BoolC a,
covar :: b
}
newtype BoolC a = BoolC {runBool :: a -> Bool}
instance Contravariant BoolC where
contramap f (BoolC g) = BoolC (g . f)
instance Profunctor EvenP where
dimap f g (EvenP a b) = EvenP (contramap f a) (g b)
evenProf :: (Integral a) => a -> Bool
evenProf n = let m = EvenP (BoolC not) 0
dmp = dimap not succ
in (runBool . contr . head . dropWhile ((< n) . covar) . iterate dmp) m False
-- build up a huge stack of nots from the contravariant argument of EvenP, finally applying it to False
take me to church
data Peano = S Peano | Z deriving (Eq, Show)
addPeano :: Peano -> Peano -> Peano
addPeano Z s = s
addPeano (S a) b = S (addPeano a b)
instance Ord Peano where
Z <= S _ = True
S a <= S b = a <= b
instance Enum Peano where
toEnum = (iterate (addPeano (S Z)) Z !!)
fromEnum = length . takeWhile (/= Z) . iterate (\(S a) -> a)
evenPeano :: (Integral a) => a -> Bool
evenPeano = let go b = \case
Z -> b
S a -> go (not b) a
in go True . toEnum . fromIntegral
newtype Church a b = Church { runChurch :: a -> b }
toChurch :: (a -> a) -> Peano -> Church a a
toChurch = (Church .) . go
where go h = \case
Z -> id
S a -> h . (go h a)
evenChurch :: (Integral a) => a -> Bool
evenChurch = flip (runChurch . toChurch not . toEnum . fromIntegral) True
-- toChurch is just a more generalized version of evenPeano generalized to any unary endofunction
Since we’re pushing the limits of trivial computer science, why not implement a solution at the lowest level? (of abstraction, not usefulness.)
evenBits :: (Integral a) => a -> Bool
evenBits = toEnum . xor 1 . (.&. 1) . fromIntegral
Let’s step it down a notch.
These infix symbols look close enough to their circuit diagrams, I swear.
class Nand a b c | a b -> c where
(|@.) :: a -> b -> c
instance Nand Bool Bool Bool where
True |@. True = False
_ |@. _ = True
infixr 8 |@.
class Xor a b c | a b -> c where
(||>) :: a -> b -> c
instance Xor Bool Bool Bool where
a ||> b = (a |@. ab) |@. (b |@. ab)
where ab = a |@. b
evenGates :: (Integral a) => a -> Bool
evenGates = foldr (||>) True . flip replicate True . fromIntegral
Here it is. We’ve defined our toolkit, our framework on top of which we can finally accomplish our goal: to check if a number is even. One-liners be damned.
-- | Returns True if the number is even.
isEven :: (Integral a) => a -> Bool
isEven n = let a = (abs . fromIntegral) n
g = mkStdGen a
f = case ((!! a) . randomRs (0 :: Int, 11)) g of
0 -> evenBits
1 -> evenPeano
2 -> evenProf
3 -> evenNaive
4 -> evenRec
5 -> evenFold
6 -> evenState
7 -> evenChurch
8 -> evenLazy
9 -> evenGates
_ -> evenIt
in f a
And the last step is defining our library for any integer:
-- | Returns True if the number is odd.
isOdd :: (Integral a) => a -> Bool
isOdd = not . isEven
Check out the source code if you like. For obvious reasons, this is staying off Hackage.