Using Westlaw, I find six reported cases that mention SHA-1 hash values.
Three merely note that the United States “adopted the SHA-1 hash algorithm . . . as a Federal Information Processing Standard.” U.S. v. Schmidt, 2009 WL 2836460 (U.S. District Court for the Eastern District of Missouri 2009). See also U.S. v. Stevahn, 2009 WL 405847 (U.S. Court of Appeals for the Tenth Circuit 2009); U.S. v. Warren, 2008 WL 3010156 (U.S. District Court for the Eastern District of Missouri 2008). Another explains that a Pennsylvania State Trooper “was able to” use SHA-1 hash values to identify child pornography images (more on that in a minute). U.S. v. Sutton, 2009 WL 3542446 (U.S. Court of Appeals for the Third Circuit 2009).
That leaves U.S. v. Beatty, 2009 WL 5220643 (U.S. District Court for the Western District of Pennsylvania 2009) and U.S. v. Schimley, 2009 WL 5171826 (U.S. District Court for the Northern District of Ohio 2009). In these cases, SHA-1 hash values played a role in a defense motion challenging some aspect of the prosecution.
We’ll start with U.S. v. Beatty. Since I described the charges and facts in the case in an earlier post, I won’t repeat them here. That post dealt with whether Beatty had standing to raise a 4th Amendment challenge; we’re dealing with a different issue here.
As I noted in the prior Beatty post, he moved to suppress evidence, arguing that the affidavit submitted to obtain the warrant didn’t establish probable cause for the issuance of the warrant. Beatty conceded that the affidavit submitted by FBI Agent Brenneis
supported a finding of probable cause to believe: (i) that the files which Trooper Pearson located through peer-to-peer networking were located on the [his] computer and (ii) that they matched the files in the Wyoming ICAC Task Force's national data base. `But what the affidavit fails to do,’ according to [Beatty], `is provide any information for [the issuing magistrate judge] to use to determine that there was a fair probability that those files were contraband or evidence of a crime.’
U.S. v. Beatty, supra (quoting Beatty’s motion to suppress). The prosecution relied, in part, on the “highly graphic titles of the files [Beatty] was making available to other P2P users”. U.S. v. Beatty, supra. In response, Beatty pointed out that people can name a file anything they want and that some file names are inherently ambiguous. In considering this issue, the district court judge noted that the file name “`“Lolita,” . . . could as easily reference an English term paper, a discussion of teacher-student relations, or . . . child pornography. Likewise, in a vacuum, the title “Teen Angel” could as likely reference a popular 1960s song as it could be a video file containing child pornography”. U.S. v. Beatty, supra (quoting U.S. v. Leedy, 65 M.J. 208 (U.S. Court of Appeals for the Armed Forces 2007)).
The judge, however, found it does not necessarily follow that
no file name can ever be regarded as a logical indication of the file's salient features. Just as one can envision circumstances where a particular file name might provide no basis for drawing inferences concerning the actual file content, one can also envision circumstances where the file name is so explicit and detailed in its description as to permit at least a reasonable inference as to what the actual file is likely to show.
U.S. v. Beatty, supra. Beatty, though, relied on comments Agent Brenneis included in the affidavit he used to obtain the warrant. In the affidavit, Brenneis explained that the SHA-1 hash value
is a `mathematical algorithm that allows for the fingerprinting of files.’ Once a file is located using a software application capable of generating the SHA1 value, that SHA value becomes a unique identifier for that file. . . `There is no known instance of two different computer files having the same SHA1 hash value.’ . . . [T]he SHA1 `digital fingerprint’ is `more unique to a data file than DNA is to the human body.’
U.S. v. Beatty, supra (quoting Agent Brenneis’ affidavit). In his argument, Beatty focused not on the identification capabilities of the SHA value but on his claim “that the names of files . . . cannot serve as meaningful indicators of the actual file content.” U.S. v. Beatty, supra. He specifically relied on this paragraph in Agent Brenneis’ affidavit:
[A]n investigator can be certain that an image being disseminated on the Gnutella network is child pornography simply by comparing that subject image's SHA1 hash value with a national database's listing of SHA1 hash values for known child pornography. This allows an extremely high degree of confidence that a known hash value represents a given file . . . regardless of the title utilized by the distributor or possessor of the image.
U.S. v. Beatty, supra. Beatty claimed that “this language demonstrates that `[t]he whole point of using the SHA1 hash value to identify files rather than file names is that the file names do not necessarily correspond to the file contents.’” U.S. v. Beatty, supra (quoting Beatty’s motion to suppress). The district judge didn’t agree:
When considered in context . . . the cited language merely establishes that use of SHA1 values provides an extremely high level of precision in identifying specific file content -- a level of precision which, according to the affidavit, is more unique than DNA matching. Such precision likely exceeds the exactitude necessary to establish proof beyond a reasonable doubt; certainly, it exceeds what is necessary under . . . probable cause standards. Thus, the affidavit may be fairly read as implying a fairly obvious principle -- that SHA1 values provide a more reliable means of identifying actual file content than is possible by virtue of file names alone. This principle, however, does not lead ineluctably to the conclusion that file names thereby always constitute meaningless information.
U.S. v. Beatty, supra. The judge therefore found that the file names were a factor the magistrate could properly consider in making a probable cause assessment. U.S. v. Beatty, supra.
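For readers who want to see why a hash value identifies a file “regardless of the title,” here is a minimal Python sketch. It is only an illustration of the general idea described in the affidavit, not the software or database the investigators used. The file names, the file contents and the `known_hashes` set are all hypothetical stand-ins I made up for the example.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the SHA-1 digest of the given bytes as 40 hex characters."""
    return hashlib.sha1(data).hexdigest()


# Hypothetical stand-ins for shared files: file name -> file contents.
same_bytes = b"identical file contents"
shared_files = {
    "Teen Angel.mpg": same_bytes,
    "term_paper.doc": same_bytes,  # same contents under a different name
    "vacation.jpg": b"some other contents",
}

# Hypothetical "database" of hash values for previously identified files.
known_hashes = {sha1_hex(same_bytes)}


def names_matching_known(files: dict, known: set) -> list:
    """List the names of files whose contents hash to a known value."""
    return sorted(name for name, data in files.items() if sha1_hex(data) in known)


print(names_matching_known(shared_files, known_hashes))
```

The hash is computed from the contents alone, so the two files with identical contents match the known value even though their names differ, and the third file does not match. That is the sense in which the court read the affidavit: the hash is the more reliable identifier, which says nothing one way or the other about whether a given file name is informative.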
That brings us to U.S. v. Schimley. After being charged with receiving, distributing and possessing child pornography in violation of 18 U.S. Code § 2252, Mark Schimley moved to suppress evidence and for a Franks hearing. U.S. v. Schimley, supra. As I explained in an earlier post, in Franks v. Delaware, 438 U.S. 154 (1978), the Supreme Court held that if a defendant proves by a preponderance of the evidence that an officer knowingly or with reckless disregard for the truth included false information in an affidavit used to get a search warrant, the court must decide if the affidavit would have justified issuing the warrant if that information hadn’t been included. So Schimley not only filed a motion to suppress, he also sought a hearing under Franks v. Delaware, which brings us back to the origins of his case.
In 2007, Pennsylvania State Trooper Erdely was using Phex software to troll a file-sharing network for child pornography. U.S. v. Schimley, supra. He found “3929 files for download” originating from a network he traced to Schimley and was “able to identify sixty movie and images files that contained names related to child pornography, based on his previous experience in other” cases. U.S. v. Schimley, supra. An FBI Agent, Agent Russ, used what Erdely had found to obtain a warrant to search Schimley’s home and seize his computer, which apparently contained child pornography. U.S. v. Schimley, supra.
After being indicted, Schimley moved for a Franks hearing, claiming that the affidavit contained false statements made knowingly or recklessly; the prosecution conceded that it contained two false statements, both concerning a file named "[Loli Child Porn] (Loli Y) Babj 00(New) by Kidzilla.avi." U.S. v. Schimley, supra. The affidavit (apparently submitted by Russ) said Erdely had found this file on Schimley’s computer, but when the computer’s hard drive was searched the file wasn’t there. U.S. v. Schimley, supra. The prosecution blamed Phex:
[Erdely] maintains a text file which contains the names and hash values of known child pornography images recovered from other investigations. When [he] enters a search term into Phex, the search results will typically reveal tens or hundreds of Phex users sharing a file containing the search term. The list would include multiple users sharing the same file, although the file may be saved under a different file name. If the search results include a file with the same name or hash value as a file stored in the text file, the name from the trooper's text file will be assigned to the image he selects for download.
Thus, according to the government, when the trooper downloaded the suspect file from Mr. Schimley's IP address, that file was cross-referenced against his text file, either by file name or hash value, and assigned a name as specified in the text file.
U.S. v. Schimley, supra. The judge denied Schimley’s motions to suppress and for a Franks hearing because she held that he hadn’t shown the false statements were knowingly or recklessly included in the affidavit. U.S. v. Schimley, supra. Schimley moved for reconsideration of her ruling, asking the judge to order a Franks hearing because she had
relied on the government's claim that the SHA-1 hash value of the downloaded file matched that of the file being shared at Schimley's IP address. He states that under Supreme Court precedent, the Court may not rely upon information outside the four corners of the affidavit in its probable cause determination. Whiteley v. Warden, Wyoming State Penitentiary, 401 U.S. 560 (1971). He insists that whether the SHA-1 hash values match is an issue of fact only a Franks hearing can resolve, and that the Court cannot simply rest on the Government's assertion that the files did in fact match.
U.S. v. Schimley, supra. The judge agreed that the probable cause determination had to be based on what was in the affidavit but still rejected Schimley’s argument:
The Court ruled the way it did, not because the government claimed that the SHA-1 hash values matched, but because the defendant was unable to meet his burden. . . . Schimley essentially argued that Agent Russ' actions were necessarily intentional or reckless simply because the affidavit contained false statements. As the Court noted then, `Franks teaches that a mere showing of falsity is insufficient to demonstrate recklessness on the part of the affiant.’ Because Mr. Schimley did not meet his burden, the Court concluded, as it does now, that a Franks hearing is not warranted.
Furthermore, . . . the Court again concludes a Franks hearing is unwarranted, because the false statements were not material to [the] finding of probable cause. . . .
[W]hile Agent Russ did not specify the actual SHA-1 hash values, he did attest that Erdely had taken steps to ensure the values did in fact match. Schimley has not otherwise met his burden that the warrant was insufficient. . . .
U.S. v. Schimley, supra. The judge also rejected Schimley’s claim that he was entitled to a Franks hearing because Erdely used a “modified version” of Phex. U.S. v. Schimley, supra.
His expert said it was “evident Erdely was not using a standard version” of Phex because the “software does not typically rename files in the way the” prosecution said Erdely’s version did. U.S. v. Schimley, supra. Schimley wanted a Franks hearing to question Erdely’s “`findings and methodology . . . through cross-examination’” but the judge declined to grant his request. U.S. v. Schimley, supra. She found that Schimley didn’t (i) show how the omission of this information misled the magistrate who issued the warrant into believing there was probable cause when there wasn’t and (ii) specifically allege “what difficulties he has encountered in attempting to validate the investigation”. U.S. v. Schimley, supra.
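To make the government’s explanation of the renaming a little more concrete, here is a rough Python sketch of the behavior it described: a found file is checked against the trooper’s text file by name or by hash value and, on a match, is given the name stored in the text file. This is my own illustration under stated assumptions and is not Phex’s actual code or the trooper’s actual file. The `KnownFile` record, the `assign_display_name` function and the file names are all hypothetical.

```python
import hashlib
from typing import NamedTuple


class KnownFile(NamedTuple):
    """One hypothetical entry in the investigator's text file."""
    name: str
    sha1: str


def assign_display_name(found_name: str, found_sha1: str, known_files: list) -> str:
    """Return the catalog name if the found file matches an entry by
    name or by hash value; otherwise keep the name it was found under."""
    for entry in known_files:
        if entry.name == found_name or entry.sha1 == found_sha1:
            return entry.name
    return found_name


# Hypothetical catalog with a single entry.
catalog_hash = hashlib.sha1(b"contents of a cataloged file").hexdigest()
catalog = [KnownFile(name="catalog_name.avi", sha1=catalog_hash)]

# Same contents shared under a different name: the catalog name is assigned.
print(assign_display_name("name_on_suspect_pc.avi", catalog_hash, catalog))

# Unrelated contents and name: the original name is kept.
other_hash = hashlib.sha1(b"unrelated contents").hexdigest()
print(assign_display_name("holiday.avi", other_hash, catalog))
```

If the software worked roughly this way, a file could show up in the investigator’s results under a name that never existed on the suspect’s hard drive, which is consistent with what the prosecution said happened here.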
I find it interesting that the (apparently only) two reported cases that use SHA-1 values in defense motions arose last year. I wonder if we’ll see more use of this and related issues by defense attorneys.