Media Files
Title:
Fixity Webinar
Collection:
AVP Videos
Organization:
AVP
Duration:
00:58:42
Agent:
Creator AVP
presenter Amy Rudersdorf
Date:
created 2017-02-08
Keyword:
checksums
fixity
tutorial
webinar
md5
SHA
digital
digital preservation
Format:
video
Description:
general
This is the recording of an AVP webinar given on the topic of the Fixity application by Amy Rudersdorf on February 8th, 2017. Webinar description: How do you know if your digital files are corrupt, missing, moved or renamed? We invite you to join us to learn how Fixity will allow you to monitor and report on file integrity and attendance. How does it work? Fixity scans a folder or directory, creating a manifest of the files including their file paths and their checksums, against which regular comparative analyses are performed. Fixity then emails a report to the user documenting flagged items along with the reason for a flag, such as that a file has been moved to a new location in the directory, has been edited, or has failed a checksum comparison for other reasons. When run regularly, Fixity becomes a powerful tool for monitoring digital files in almost any storage location. Download and find out more about Fixity at weareavp.com/products/fixity/

Language:
Primary English
Rights Statement:
CC BY 4.0

Publisher:
AVP
Preferred Citation:
Fixity Webinar, AVP, 2017

Hi everyone! Thanks for coming today. We welcome you. Just
a couple of things before we begin. My name is Amy Rudersdorf, and
I'll be presenting today. We wanted to let you know a couple of
things. You can see them on the screen in front of you,
but I thought I'd step through them quickly. If you want to enter
full screen mode in GoToWebinar, you're gonna wanna look for that
little icon with the four arrows.
There are going to be a lot of details in the presentation today,
so being in full screen mode will probably help you see them.
Also if you have questions you'll want to find the question mark icon in
that panel on the right of your screen and you can ask questions
there, and then finally the webinar is going to be recorded and it
will be available online after we're done today.
With that, I will jump into the presentation.
Today we're gonna be talking about Fixity. This is the third in our
series of webinars on some of the free tools AVPreserve has developed to
support collection management and digital preservation. If you're not familiar
with AVPreserve, we are a data management consulting and software development
firm. We're focused on advancing the ways in which data is used for
the benefit of individuals, organizations, and causes.
So today's webinar is on Fixity, which is a tool to support the
long term management of digital files.
So here's what we're going to cover today, we'll talk a little bit
about what Fixity is, go over some definitions so we're on the same
page and then we'll talk about why to use Fixity, and we'll go
through some demos of how you actually use Fixity, and can use it
in your day to day work, and then finally some details about where
you can find it and how to learn more.
So, Fixity was created with the sole focus of fulfilling the requirements
of those concerned with monitoring and managing the fixity of a collection
over the long term.
There are a lot of free and open source checksum utilities out on
the market, but they don't offer the feature set or simplicity necessary
to fulfill the needs of many organizations. So, Fixity is intended for use
in monitoring collections of files that are in a final state or are
ready for deposit into an archive or preservation oriented repository.
Fixity scans a folder or directory and creates a manifest of the files
within it. This includes the file paths and checksums, and against those
it does a regular comparative analysis. It runs that analysis over and over
again. It monitors file integrity through generation and validation of those
checksums, and also file attendance through monitoring and reporting on new,
missing, moved, or renamed files. And we'll talk quite a bit about
that.
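To make the manifest idea concrete, here is a minimal Python sketch of that kind of scan. It is not Fixity's actual code. The function names `file_checksum` and `build_manifest` are hypothetical, and using the file system's inode number as the "index location" is an assumption made for illustration.

```python
import hashlib
import os


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root, algorithm="sha256"):
    """Walk `root` recursively and map each file path to (checksum, inode).

    Assumption: the inode number stands in for the "index location".
    """
    manifest = {}
    for folder, _subfolders, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(folder, name)
            manifest[path] = (file_checksum(path, algorithm), os.stat(path).st_ino)
    return manifest
```

Running `build_manifest` on the same folder at a later date and comparing the two results is the kind of comparative analysis described above.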
So I'd like to go through a few definitions before we get started
with the demos, to make sure that everyone understands what I'm talking
about. When I say things like file fixity, file integrity, checksums, things
like that.
So first of all, file fixity or file integrity are digital preservation
terms, and they refer to the property of a digital file being fixed,
or unchanged.
Fixity checking, which is what the Fixity tool does,
among other things, is the process of verifying that a digital file has
not been altered or corrupted. So during transfer of a file,
for example, a repository may run a fixity check to ensure that file
has not been altered en route. Within a repository, which is the example
we're gonna talk about today, Fixity is used to ensure that digital files
have not been affected by bit rot or accidental or malicious changes to
the file.
By itself, fixity checking does not ensure the preservation of a digital
file, instead it allows the repository to identify which files are corrupted
so that an action can be taken. So if a file is corrupted
it could be replaced with a back up.
In practice, a fixity check is most often accomplished by computing checksum
values for a file, and then comparing them to a stored value.
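As a rough sketch of that compare-to-a-stored-value step, the following Python works on bytes in memory rather than files. The names `checksum_bytes` and `fixity_check` are hypothetical and are not part of Fixity.

```python
import hashlib


def checksum_bytes(data, algorithm="md5"):
    """Return the hex digest of some bytes."""
    return hashlib.new(algorithm, data).hexdigest()


def fixity_check(data, stored_checksum, algorithm="md5"):
    """Return True when a freshly computed checksum matches the stored one."""
    return checksum_bytes(data, algorithm) == stored_checksum.lower()


# Record a checksum once, then verify the content against it later.
stored = checksum_bytes(b"original content")
print(fixity_check(b"original content", stored))   # unchanged content passes
print(fixity_check(b"original content!", stored))  # one added byte fails
```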
So checksums, then, are digital fingerprints of files used to test fixity.
The smallest change to a file will cause its checksum to change,
so you can see how that would be useful.
And when tested over time, checksums can also monitor for file attendance.
That's to make sure that files have not moved from their intended location.
That's what file attendance means: the files are where they're supposed
to be.
So there are several different checksum algorithms and the two examples
here are MD5 and SHA 256 and those are the two checksum algorithms
that are available through the Fixity tool.
I was talking about the difference between MD5 and SHA 256. And so,
basically I'll be using MD5 today because it really is sufficient
for the purposes that we'll be using it for
and it's well supported in tools like Fixity, and it's easy and
quick to calculate.
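For a sense of how the two algorithms differ in output, this small Python sketch computes both for the same bytes using the standard `hashlib` module. An MD5 digest is 128 bits (32 hex characters) and a SHA-256 digest is 256 bits (64 hex characters). The function name `digests` and the sample bytes are made up for illustration.

```python
import hashlib


def digests(data):
    """Return the MD5 and SHA-256 hex digests for the same bytes."""
    return {
        "md5": hashlib.md5(data).hexdigest(),
        "sha256": hashlib.sha256(data).hexdigest(),
    }


# Each hex character encodes 4 bits, so length * 4 gives the digest size.
for name, value in digests(b"sample file content").items():
    print(name, len(value) * 4, "bits:", value)
```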
So, Fixity uses checksums to test for file integrity and attendance,
and when run routinely it's a powerful tool for monitoring digital files
in long term storage, so that's a great reason to use it.
Okay, so now we'll walk through setting up Fixity, creating a project,
and then we'll run through file integrity and file attendance checks several
times, so you can see how Fixity monitors change over time.
So we'll run the scan process, which I'll explain to you, several times
throughout the next few minutes for the
demo. The first step in using Fixity is to download it and install
it. You can find it on our website, which is avpreserve.com, under the
tools section, and if you scroll down to Fixity, you'll see that we
have versions for Windows and Mac, and there's also source code available
on GitHub. There are also video tutorials and the user guide there.
When you download it, the Fixity executable file,
as well as some other folders for your future projects, reports,
and schedules, is stored within the parent directory.
This directory can be placed anywhere on your computer, but it has to
stay structured as it is. If you move the Fixity executable or any
of the directories from the parent folder, the program won't function.
So we recommend that you put it in your applications folder,
but for the demo today, I'm putting it next to my digital collections
folder just for ease of use, but you can create a shortcut from
applications to anywhere on
your drive so that you don't have to go to applications every time
to use it.
So the Fixity directory looks a little bit different on Mac and PC.
I'm using a Mac today, so you'll see here that the history and reports
folders
are there as well as some licensing information and user guide and the
app itself.
On a Windows machine, there are actually some other files as well,
but the core files that you're going to use are that Fixity app and the
history and reports directories. If you first open this
folder on your Mac and the history and reports
directories aren't there, that's okay. They're actually created the
first time you run the app. So I'll show you how that works
next.
So, just to reiterate, it's important that you not change the names of
any of these directories, or files because the application will not work
if you do that.
So here's an example where we're opening the Fixity app for the first
time. You'll see the history and reports folders are being created,
and again, on the Windows machine those files and a few others will
be there the first time you look in the directory.
So the Fixity GUI looks like this, and I'll talk about what each
of the sections are from left to right. You'll see that there are
four columns or boxes and the first one is the projects box
and it's the first one on the left. As you create and save
new projects they'll appear in this box and we will create a project
after I go through this explanation of the interface.
The next box, second from the left, is scheduling, and the power of
Fixity is that you can schedule the generation and validation of checksums
and testing for file attendance here.
Using the radio buttons, you can choose between monthly, weekly, or daily
scans, and then the appropriate date and time options will appear.
So in this section as well, you'll see that there is an option
to email only upon warning or failure.
You can set up recipient email addresses, so every time the system runs
a scan, you can get a report on that scan.
So that's a lot of reports if you're running things weekly or daily,
so you have the option here of only emailing when there's an error
in a report.
And then finally, Fixity displays the date and time of your project's most
recent scan. But in this case, we haven't scanned anything, so no data
appears. In the third column, you'll see directories, and there are seven
text boxes. Next to those text boxes are two buttons, one of which
is the three dot or ellipsis button, and that's the button you
click to actually set up which directory you want Fixity to scan,
to monitor. We will do that in a second, but you have
up to seven directories that you can have Fixity monitor for any project.
And then finally, as I mentioned, you can set up email addresses where
those reports for the monitoring processes go, and
you can have up to seven email addresses as well.
You'll see that in the recipient email address section, that last column,
the boxes are grayed out, and that's because we haven't set up our
email settings yet. In order to actually put an email address in
one of these text boxes, you have to set up your preferences for
email, so we'll do that real quickly and then we'll actually be able
to set up a project and run some scans and monitor our files.
So let's set up email preferences, so we can create a project.
Without setting this preference, Fixity will not be able to send report
emails, so this is really key to this process.
It's important to note that Fixity will encrypt
your email credentials before storing them in order to provide account
security. So we'll go to Preferences and Email Settings, and we'll type
in our email information. You'll see that it's set up to handle
Gmail to start with, and you'll have to change those settings for your
email if you don't use Gmail. I'm checking the credentials to make sure
that the system is set up correctly, and then I save my credentials.
You'll hit Save and Close to do that,
and then we can
check our email to see if our credentials worked.
You can see that you get a report that looks like this
if the email settings are successful. If you don't see this,
you should definitely check your spam folder before you start troubleshooting.
If you still don't get it, then there's something wrong with the settings
for your email, and you need to troubleshoot that. You may need
to talk to your IT department to get the right information,
but it should look the same as your internal email system settings. Okay,
so finally, let's go ahead and set up a project so we can
begin monitoring our files. First, we'll go to File in the menu and
select New Project, and we'll name it here. I'm naming it Fixity Demo, and
then we will set up a schedule for how often we want the
project to scan our files. So I'm choosing weekly, on Sunday
mornings at 3 AM. Note that the system works in military time,
so 3 PM would be entered as 1500 hours. I'm opting to receive
email messages when there's a problem with the scan,
and I will then select two directories from my digital collections folder
that I want to have monitored on a weekly basis.
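The weekly schedule and the 24-hour time entry described above can be sketched in Python as follows. This is only an illustration of the arithmetic, not how Fixity schedules its scans internally. The function name `next_weekly_run` and the four-digit `"HHMM"` string format are assumptions.

```python
from datetime import datetime, timedelta


def next_weekly_run(now, weekday, hhmm):
    """Return the next run time on `weekday` at 24-hour time `hhmm`.

    `weekday` follows Python's convention (Monday=0 ... Sunday=6).
    `hhmm` is a four-digit string such as "0300" (3 AM) or "1500" (3 PM).
    """
    hour, minute = int(hhmm[:2]), int(hhmm[2:])
    candidate = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    if candidate <= now:
        # This week's slot has already passed, so use next week's.
        candidate += timedelta(days=7)
    return candidate


# Wednesday 8 February 2017 at noon; the next Sunday 3 AM slot is 12 February.
print(next_weekly_run(datetime(2017, 2, 8, 12, 0), 6, "0300"))
```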
And then I will enter my email address as the recipient for the error
messages,
and then I've more or less finished setting up my project,
so I'll go back to the File menu and choose Save Settings.
But we're not done yet. We also have the option of choosing which checksum algorithm
to use. By default it's set at SHA 256, so if we
go up to Preferences and select Checksum, you can see we can change
it to MD5 and save those settings. And then finally, I want to
filter out any unwanted or hidden files, like the Thumbs.db or the .DS_Store
files, so I can do that by selecting Preferences,
and then Filter Files, and choosing Ignore Hidden Files,
and then saving that. So now we can run our first scan on
the selected directories. We don't have to wait until Sunday morning.
We can run a scan whenever we want by selecting File
and Run Now. By default, the project is always saved before the scan
is run, and the scan is completed when the console window closes.
The results of the scan are emailed to the recipient indicated in the
project, if that option was selected, and then stored in the history and
reports directories in the Fixity folder. Each time a project is run and
the files are scanned, a snapshot of the manifest data is saved to
the history folder. So as scans continue over time and monitoring
happens
over time, more and more snapshots are created and an audit trail is
produced. Previous snapshots can be referenced, and a project can revert
to an earlier snapshot, if necessary, through the Import Project function.
So let's take a look at the history directory.
Since we've only run one scan, there is just one snapshot. Snapshots are
named with the date and time that they were
produced, and the project name.
The snapshot is a tab delimited file, or a TSV file,
and you can view it in your favorite text editor, or in Excel,
so we'll look at it in Excel for ease of viewing.
History snapshots contain the directories that were scanned, and that's
at the top, the recipient email address, and that's line two, schedule information
including the date and time of the run and filter file and checksum
algorithm preferences,
each line following the checksum algorithm, so that's MD5 in line six, contains
the hash value in the first column, so that's the checksum value for
each of the files that was scanned,
the file path, so you can see where these files are located on
my computer, and then the location in the file system index for each
file scanned.
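If you would rather read a snapshot with a script than with Excel, a sketch along these lines may help. It assumes, based only on the description above, that the first six lines hold the project settings and that each later line has three tab-separated columns: checksum, file path, and index location. The function name `read_snapshot_rows` and the sample values are hypothetical, so check a real snapshot before relying on this layout.

```python
import csv
import io

HEADER_LINES = 6  # assumption: project settings occupy the first six lines


def read_snapshot_rows(text, header_lines=HEADER_LINES):
    """Return (checksum, path, index) tuples from tab-delimited snapshot text."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    rows = []
    for line_number, row in enumerate(reader, start=1):
        # Skip the settings lines and anything without three columns.
        if line_number <= header_lines or len(row) < 3:
            continue
        rows.append((row[0], row[1], row[2]))
    return rows


# Placeholder content standing in for a real snapshot file.
sample = "\n".join([
    "directories", "email", "schedule", "last run", "filters", "md5",
    "aaa111\t/collections/a.jpg\t101",
    "bbb222\t/collections/b.jpg\t102",
])
for checksum, path, index in read_snapshot_rows(sample):
    print(checksum, path, index)
```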
The Reports folder contains all of the reports that track changes to files.
Since we've only run one scan, there is just one report,
and the reports are named the same way, with the project name
and the date and time they were produced. Like the history snapshots,
they too are tab delimited files, and you can view them
as you would any tab delimited file or text file, either
in a text editor or using Excel, and again, we'll look at it
in Excel.
And so this is the first report that we've run,
and it's the Fixity report. You'll see that
that's the name of it. There's the project name, and that's the
name of the project for which the report was generated, so in this case,
Fixitydemo. And then again the algorithm used, the date, and then here you'll
see in line six the total files, and those are the total files
scanned or removed, so that number may not actually equal the number of
files in a directory, and I'll explain that a little further later.
Confirmed files are files that are unchanged. So
that's sort of synonymous. And then there's also a list of things that
can happen to a file, it can be moved or renamed,
it can be new, changed or removed, and all files and file paths
are listed with their status. So keep an eye on this as we
scan more files and how this changes.
Now let's see what happens when files move or change. So you'll see
here that we're adding a new file to the directory,
and then we'll change the name of another file,
and I will move to that other directory, where we're also monitoring files,
and delete something from there.
Then we'll head back to Fixity and run the project again to
see what happens in the manifest snapshot and the report.
So from the File option
in the menu, I'll select Run Now again, and then remember that the
project setting is always saved before the scan is run. Once that
console window closes, we can check out the
report. So there are now going to be two reports in each folder,
one for the first scan and one for the scan we just ran.
The reports will continue to accrue as Fixity monitors the collections over
time.
So here's what that new report looks like, and you can see that
we've got all kinds of information being reported here.
One file has been renamed, and that's "green beans"; you saw that happen.
One file was removed, and that's indicated here, and a new file
is also indicated, in row fifteen and row nine. Note that although there
are only ten files total, because we added one and deleted one and
we started with ten,
all the files that were changed or confirmed are recorded in line six,
total files, so there's actually eleven, to account for that file that was
deleted.
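The arithmetic behind that total of eleven can be spelled out in a few lines of Python. The function name `report_totals` is hypothetical. The count of eight confirmed files is derived from the demo (ten original files, minus the one renamed and the one deleted) rather than read from the report itself.

```python
def report_totals(confirmed, moved_or_renamed, new, changed, removed):
    """Return (total listed in the report, files actually on disk).

    The report total counts removed files too, so it can exceed the
    number of files currently in the monitored directories.
    """
    total_in_report = confirmed + moved_or_renamed + new + changed + removed
    files_on_disk = total_in_report - removed
    return total_in_report, files_on_disk


# Second scan in the demo: 8 confirmed, 1 renamed, 1 new, 0 changed, 1 removed.
print(report_totals(8, 1, 1, 0, 1))
```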
And here's a detail of that, so that you can see
that information a little closer.
In the manifest snapshot, in the History directory, the file we deleted
is gone and the new file, "junk food",
is there. Note that the checksum for the file we renamed does not
change. The "green beans" file changed name, but
the checksum doesn't change, and that's as you would expect.
So here's the highlighted file in snapshot two
and then here's the file as it appeared in snapshot one.
You'll see that the file hasn't changed, just the name.
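The reason a rename leaves the checksum alone is that the checksum is computed from the file's contents only, not its name. A quick Python check of that, using a temporary folder and made-up file names:

```python
import hashlib
import os
import tempfile


def md5_of(path):
    """Return the MD5 hex digest of a file's contents."""
    with open(path, "rb") as handle:
        return hashlib.md5(handle.read()).hexdigest()


def rename_preserves_checksum(content=b"example image bytes"):
    """Write a file, rename it, and report whether the checksum is unchanged."""
    with tempfile.TemporaryDirectory() as folder:
        old_path = os.path.join(folder, "beans.jpg")
        new_path = os.path.join(folder, "green_beans.jpg")
        with open(old_path, "wb") as handle:
            handle.write(content)
        before = md5_of(old_path)
        os.rename(old_path, new_path)
        after = md5_of(new_path)
    return before == after


print(rename_preserves_checksum())
```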
So with these two snapshots, along with the two reports, we now have a
record of how files have changed and a means of monitoring their integrity
moving forward in time.
So how does Fixity identify which changes have occurred to which file?
This table is helpful in interpreting the file status in the report.
In order to identify whether a file is changed, removed,
or renamed, Fixity tests whether a file is present or not
in the directory where it's expecting to find it, whether the checksum has
changed or is the same, whether the file path has changed or not,
and whether the index location is different or the same.
And then, depending on the answers to each of these questions, the report
labels each file appropriately.
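The table itself is not reproduced in this transcript, so the following Python is only one plausible reading of those four tests, not Fixity's published decision logic. It compares two manifests of the form `{path: (checksum, index)}`. The function name `classify_changes`, the status strings, and the exact rules are assumptions for illustration.

```python
def classify_changes(previous, current):
    """Compare two manifests of the form {path: (checksum, index)}.

    Returns {path: status}. Assumed rules:
      same path, same checksum            -> "confirmed"
      same path, different checksum       -> "changed"
      new path, same checksum and index   -> "moved or renamed"
      new path otherwise                  -> "new"
      old path with no match              -> "removed"
    """
    statuses = {}
    previous_by_index = {
        index: (path, checksum) for path, (checksum, index) in previous.items()
    }
    matched_old_paths = set()

    for path, (checksum, index) in current.items():
        if path in previous:
            matched_old_paths.add(path)
            old_checksum, _old_index = previous[path]
            statuses[path] = "confirmed" if checksum == old_checksum else "changed"
        elif index in previous_by_index and previous_by_index[index][1] == checksum:
            matched_old_paths.add(previous_by_index[index][0])
            statuses[path] = "moved or renamed"
        else:
            statuses[path] = "new"

    for path in previous:
        if path not in matched_old_paths:
            statuses[path] = "removed"
    return statuses
```

With a manifest from one scan and a manifest from the next, the returned dictionary plays the role of the per-file status column in the report.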
And just a reminder, we had errors in that last
report,
that last scan. So I got an email
that contained those notes about the report, indicating what changes
occurred, and I also got the actual report as an attachment.
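For anyone curious what an email like that looks like in code, here is a sketch that builds (but does not send) a message with the report attached, using Python's standard `email.message.EmailMessage`. The function name `build_report_email`, the subject line format, and the example addresses are all hypothetical. Actually sending it would need your own mail server settings, which are not shown here.

```python
from email.message import EmailMessage


def build_report_email(sender, recipient, project, summary, report_name, report_text):
    """Build an email whose body is a summary and whose attachment is the report."""
    message = EmailMessage()
    message["From"] = sender
    message["To"] = recipient
    message["Subject"] = f"Fixity report: {project}"  # hypothetical subject format
    message.set_content(summary)
    # Attach the tab-delimited report as a plain text file.
    message.add_attachment(report_text, subtype="plain", filename=report_name)
    return message


email = build_report_email(
    "fixity@example.org",
    "archivist@example.org",
    "Fixity Demo",
    "1 file renamed, 1 file removed, 1 new file",
    "report.tsv",
    "Removed File\t/collections/old.jpg\n",
)
print(email["Subject"])
```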
Okay, so let's run the scan again, this time not making changes to the
files. Now we'll see that, since there have been no changes
since our last scan, the report will be clean, and the manifest
will be the same as the previous one.
So we'll run through... it's running the scan now,
and then we'll go first to the reports folder,
and now there are three files, 'cause we've run three different scans,
and you'll see that there are ten confirmed files. Everything's confirmed
because they're unchanged, so all is good in our directories.
And then if you look at the history snapshot,
that is the same
list of files with the same checksums as in the previous
snapshot.
You can do a comparison of the two by looking at each of
the
manifests.
Okay, so we're gonna do one more scan before we wrap up the
demo, and this time we're gonna introduce an error in the data
that replicates bit rot, so it's actually going to change the essence of
the file. So, first, I'm going to actually edit the content of this
image file in a text editor, and I'm just gonna go in and
delete a section of the bits. So we open the text editor,
and that's what the image looks like in a text editor. We'll just randomly
select a chunk of the data and delete it,
and you'll be able to see the actual error in the icon.
In the junk food image, you can see now that there's a purple band
at the bottom. The file now has this error introduced to it.
So now we'll go back and run that scan one more time,
and we'll see how Fixity picks up the change in both the report
and the snapshot.
So, here's the report, and the file is recognized as changed,
and here is a side by side comparison of the current snapshot
and the previous snapshot. The changed file has a checksum that starts
56B79, and that is different than the checksum in the previous
scan. So this is just another indicator that the integrity of the file is
at risk. By monitoring the files on a regular basis, errors like this
can be identified and then action can be taken. The actions aren't
part of Fixity, but Fixity helps you to identify
that something has happened so that
you can actually replace corrupted files with backups, for example.
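The same experiment, deleting a chunk of data and then acting on the failed check, can be sketched in Python on bytes in memory. As noted above, the recovery step is outside of what Fixity does, so the `restore_if_corrupt` function here is purely a hypothetical illustration of the follow-up action, and the other names are made up as well.

```python
import hashlib


def md5_hex(data):
    """Return the MD5 hex digest of some bytes."""
    return hashlib.md5(data).hexdigest()


def delete_chunk(data, start, length):
    """Simulate damage by removing `length` bytes beginning at `start`."""
    return data[:start] + data[start + length:]


def restore_if_corrupt(current, backup, stored_checksum):
    """Return the current bytes if they pass the check, else a verified backup."""
    if md5_hex(current) == stored_checksum:
        return current
    if md5_hex(backup) == stored_checksum:
        return backup
    raise ValueError("neither the current copy nor the backup matches the stored checksum")


original = bytes(range(256)) * 4          # stand-in for an image file
stored = md5_hex(original)                # checksum recorded by an earlier scan
damaged = delete_chunk(original, 100, 50)
print(md5_hex(damaged) == stored)         # False: the damage is detected
print(restore_if_corrupt(damaged, original, stored) == original)  # True
```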
So we've been running the scans manually using the Run Now function in
Fixity. It's important to remember, though, that when we first created the
project, we set Fixity to run on a schedule,
in this case, weekly on Sundays at 3 AM. The program is gonna
run regularly and automatically,
without us actually clicking Run Now, and it's gonna do so whether it's
open or not. So as long as the computer is on,
Fixity will run in the background, and it'll create reports and snapshots
that are collected in the appropriate folders. This really enables an
organization to monitor fixity and attendance over time without a lot of
overhead.
So if you want to get Fixity for yourself, you can go to
our website, avpreserve.com/tools/Fixity. You'll find all the information
you need there, and you can also learn more
at that same location. Or join our Google Group. We have a Fixity users
Google Group where you can ask questions or get news about latest releases
and things like that. There's definitely a community of users, and we would
be glad if you would join us.
So now, if folks have questions, we have some time to take them.
Great, thank you Amy, we have quite a few questions.
So, I'm gonna take them. I grouped the ones that were similar
together into categories,
but I'll kind of take them as they came in.
So one current Fixity user, who uses it on a Mac,
asked whether there are still issues with running
Fixity on the latest macOS update.
I can field that one if you want. I'll say that
we are currently working on compiling a new version of Fixity that will
be compatible with the latest macOS version. What you would download today
would have issues in the very latest macOS release.
So that is in the works.
There were a few people that asked about whether Fixity was available on
Linux, and the answer to that is that it is not.
We have done Mac and Windows releases. There was
someone a few years ago, at some point, that had gotten Fixity to
run on Linux, but it was not us,
and we don't officially support it in any capacity.
However, if there were enough people that were
expressing interest, and I noticed there are three on this call,
it is something that we would definitely be willing to look into.
So next question: Can you use this in a server environment, or is it desktop
only? You can use it in a server environment.
Do you wanna field that further, Chris? So yeah,
there's a couple of ways to break it down that might be worth
mentioning. One is that you can use it in a server environment in
the sense that
you can scan files that are on servers. The application is
locally run,
so there is not a server side Fixity application. You would be
running it from a desktop. It's a desktop application that you could run
and scan servers with, but it can't be run server side
based on what we've done to date.
And a similar question: Can the tool be scripted?
They refer to a non GUI use of the tool. We have
not created a command line version, somewhat intentionally. As was
pointed out early on in the presentation, there are lots of checksum tools,
and the way that many organizations are using those checksum tools is in scripted
workflows.
But with Fixity, although organizations
large and small have adopted it or started using it, large,
small, sophisticated, with more resources, fewer resources...
the original intent was to give a tool to people who didn't have
IT resources and weren't gonna be the type of people that were creating
scripts using command line applications. So we wanted to create a simple desktop
GUI tool, and that's what we've created to date... There were a couple
of questions that came in on what types of things can be scanned
with Fixity,
and essentially
the answer to that is: any file, or anything that you can see
from your desktop,
so over a network mount, or a local or removable drive,
you can scan with Fixity. There's one older
CIFS networking protocol implementation that
has proved to be problematic, and we mention that in the user guide as
a known issue. The way that it manages
the location identifiers
is different than every other protocol, so that throws Fixity off. But other
than that... And someone even asks about hidden files.
If a file is hidden on your operating system and you can't
access it, then Fixity won't either, but if you have permissions or have configured
your operating system to
see hidden files, then you will be able to scan those as well,
so that is kind of up to the user
to determine. Lots of questions about recursion. Does it support
scanning of sub directories? So if you select the top level
directory, will it support scanning of sub directories, and
to what extent, how many files will it support, and things like
that.
The answer is yes, it fully supports recursion and scanning of sub
directories. As for limits, there is no limit beyond
what would exist in the operating system itself
as far as number of sub directories, and there are no explicit limits within
Fixity on that. And on the quantity of files or the size of files,
there is no limit, but that's not to say that you won't
encounter the types of issues that you would encounter in trying to do
anything with very large file sizes or lots of files, meaning that if
you're trying to scan, you know, many, many, many terabytes over a network,
then you'll run into the same types of issues that you would run
into if you were trying to do anything with that much data over
a network. And there's nothing inherent in Fixity that
would cause those issues.
It'd be the same whether you had scripted a checksum application to
scan that same data over the network or whether you used Fixity
to do it. It's gonna be dependent on system performance,
network performance, and things like that.
As a point of reference, we have an 18 terabyte NAS device that
we routinely scan without issues. Now, one thing that might be useful to
mention is that the way that we set that up internally with
our NAS is that we have set up several
projects to run, and scheduled them so that, should something go wrong with
a project, it doesn't take the entire scan down. And
we kind of just did that
because we thought it was a good idea at the time.
It wasn't because we had issues and went back and did it.
But I think it's a smart way to do it,
to kind of chunk it into projects so that if there is an
issue, you can identify which project, and if it does stop the
scan, it doesn't stop everything; it would move on to the
next project.
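The benefit of chunking a large collection into several projects can be pictured with a short Python sketch in which each project's scan runs independently, so a failure in one is recorded and the rest still run. This only illustrates the idea and says nothing about how Fixity itself executes projects. The function name `run_projects` and the project names are hypothetical.

```python
def run_projects(projects):
    """Run each project's scan; a failure in one does not stop the others.

    `projects` maps a project name to a zero-argument callable that
    performs the scan and returns a result (here, a file count).
    """
    results, failures = {}, {}
    for name, scan in projects.items():
        try:
            results[name] = scan()
        except Exception as error:  # record the problem and keep going
            failures[name] = str(error)
    return results, failures


def failing_scan():
    raise OSError("network share unavailable")


results, failures = run_projects({
    "audio": lambda: 10,
    "video": failing_scan,
    "images": lambda: 5,
})
print(results)   # the audio and images projects still completed
print(failures)  # the video project is identified as the one that failed
```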
Sorry, Amy, I'm hogging all the answers here... That's fine,
you're doing a great job. Keep going.
Someone asked, and this came in early on, so I'm
hoping that the presentation answered this question anyways, but could you
explain how Fixity is unique among similar types of fixity checking tools?
Do you wanna field that one Amy?
Sure. I think the difference is that
you can set it up to automatically monitor your directories over time.
Oftentimes, with the fixity tools that are out there,
you can set it up to run once, and then you would need
to manually run it again so that you can monitor those files.
But this actually does it on a recurring basis, and it accrues these files,
these reports, in a readable, meaningful way that makes these activities auditable.
Right. Yeah, I think maybe just the simplicity factor, too,
is a difference. A lot of the more sophisticated checksum tools tend to
be command line, so
you gotta be pretty skilled at working on the command line to
get the same level of functionality as you do with Fixity. Let's
see, how does the system pick up that a file name has changed?
Does this only work if there's a file in the same directory
with the same checksum? You know, this question came in and then I
saw a slide that you put up, Amy, that answers this question.
There's that chart that kind of says what the logic is
for each of the declarations that Fixity reports on,
what a name change is, or what a
file move is, or something like that, for instance. So I would refer
you to that.
That same chart is in the Fixity user guide.
We had to think very thoroughly through which logic
would end up with which outcomes,
and that's what we've captured there.
Over a year or more... Let's see, someone asks... Basically just recognizing
that if you were running this thing regularly for a long period of
time, you're gonna generate a lot of the snapshots, and they're
asking how to organize them, or do you delete them, what
do you do with those? I'd say that
really, it can be a lot of files, but it's a small amount of
data.
In some ways, it gives you an audit path, so I'd really say
it's up to you and what your use cases are and how useful you
find the information, as an organization or as a user. But
I think that
what's nice about it is that it does give you an audit trail,
and we've thought about creating another level of functionality which would
actually give you a user interface or visualization across those snapshots
so that you had a more easily navigable, intuitive interface
for that audit history, what's changed over time,
and it would be using the data that was in these snapshots.
So we're thinking that may be a use for it that would
manifest over time, and that would be pretty nifty. But
yeah, I'd say it's up to the user.
If it really didn't matter much to you, and you just wanted
to know what was going on today,
and you think there's little or no chance
you'd be going back a year and seeing what happened over time,
then I would say, by all means, you can delete those.
Someone asks Where are the files kept and I assume they mean the
snapshot files and those file. Do you want to take that,
Amy? There's a lot of questions coming in, I'm gonna do some question
organizing here. Sure, the files are kept in the Fixity directory along
side the
executable, so wherever you put that directory on your computer, so you
might put in your applications folder, the history and reports directories
are inside that Fixity directory. So, so today I had it,
I had put it next to my digital collections directories in my documents
directory on my computer. But you can put it wherever you want,
whatever's easiest for you. The important thing is that... Those
directories, the reports directory and the history directory, they need to
be in the location where they are placed by the application when it is
first unzipped, so that's in the same directory as the executable file...
So, in, in that parent
Fixity folder. Let's see, alright, I'm gonna move on to questions that have
come in since we started answering questions, so someone points out that
we may want to basically just mention as well that
files may change and this person references embedded metadata. You know,
that you might change embedded metadata in a purposeful way, and get a
change in the checksum and that's accurate, I think Amy pointed out early
on in the definition that
it doesn't necessarily mean that because something has changed, it's corrupt,
it may be a purposeful change,
so this doesn't... We don't need to give the impression that a change
is a corruption or unwanted and it may very well be purposeful.
But, thanks for clarifying that.
Is there a limit to the number of projects you can set up?
There is no limit. However,
what you will run up against is that there are only so many
hours in a day to run scans, so, as you're scheduling these in, then
you need to be thinking about that... Now, I think scans can run
simultaneously, so let's say you schedule one
and it's gonna take an hour, and you schedule another one to start
a half hour in, and that's gonna take an hour, so there'll be
an overlap of a half hour there. That will work.
You don't need to be concerned about that. However, it's gonna screw up
performance,
network, if it's over the network, it's gonna take up more bandwidth or
they'll be competing for bandwidth, but there is no limit of projects. Alright...
Some of these are answered already.
Is there a time limit for how long a scan can run?
No, no, no, no, not any within Fixity. If there was something within
your computer operating system, that had some type of
sleep but it shouldn't do that because it should be seen as an activity,
so it shouldn't go to sleep or anything.
One thing, if you are gonna, as Amy mentioned, when you run this... So
as long as the computer is on it'll run, in the Windows environment,
you can set this if you run this as administrator and this is
in the user guide,
in order for it to run when it's logged off, you need to
set these up
as the role of a system administrator within Fixity, because it interacts
with the Windows scheduler
so if you do it as a system admin, then it will or
a system user,
then it will even if you're logged off the scans will run,
if you're a user and you don't have a system admin role, then
it won't. Now that can be problematic in
some organizations, but that is how that works.
Someone asks: Wondering how this tool might complement bagger or Bagit.
Yeah, we've talked about having Fixity recognize bags
and running basically kind of running on top of that and having some
shared functionality there, but it's something that's only been discussed
to date. We have not acted on that
in any way, but I think there's for organizations that
store their data as bags
that could be very interesting. And of course, our Exactly tool uses,
is all about bags so we've talked about how those may tie together,
but yeah, if there's interest in that or use cases, we would love
to hear from people about that.
Can Fixity reference files on LTO tape?
Or is that even necessary?
It could, if it were, say, LTFS meaning that it will act just
like a removable drive an LTO tape will load up on an operating
system like a removable drive.
It could access it in that way if it were an LTO Tape which
had data written to it, using a third party software
on something like Symantec or Veritas or
Retrospect, something like that, then no, because that uses a proprietary
way of writing the data that would not make the files available in a
transparent way to the operating system. In
the history snapshot, in column C, what does the long string of digits represent?
Do you have a quick reference point that we could look at there,
Amy in the slides?
I don't yeah, and I... So I had that too, that I...
So it's the file path index.
Okay, good, thank you.
So is that the inode? Is that what we're talking about...
I'm guessing that's probably what that is. It's
called different things in different operating systems, but
that is an identifier that helps us know,
for instance, if a file name has changed,
it will maintain the same identifier within the operating system.
So the inode is what allows us to know that a file is the
same file, even if the file name has changed,
because it's got the same index value. Now, I think again I would
refer to that logic because I know that there is something about how
the inode and the checksum play off with each other, in order to
make these determinations that
that's what that is.
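As a rough illustration of the kind of logic being described here, the sketch
below compares one file's record across two scans using its path, its checksum,
and an inode-like index. It is not Fixity's code: the Record type, its field
names, the classify and compare helpers, and the result labels are all
hypothetical, and the chart in the Fixity user guide remains the reference for
the real rules.

```python
from typing import NamedTuple, Optional


class Record(NamedTuple):
    """One file as seen by a scan (hypothetical layout, not Fixity's format)."""
    path: str
    checksum: str
    index: int  # inode-like identifier assigned by the operating system


def classify(old: Record, new: Optional[Record]) -> str:
    """Label what happened to a file between two scans (illustrative rules)."""
    if new is None:
        return "missing"  # no file with this index was found in the new scan
    same_path = old.path == new.path
    same_checksum = old.checksum == new.checksum
    if same_path and same_checksum:
        return "unchanged"
    if same_checksum:
        return "moved or renamed"  # same index and content, different path
    if same_path:
        return "changed"  # same index and path, different content
    return "moved or renamed, and changed"


def compare(old_scan, new_scan):
    """Match records by index, then classify each file from the old scan."""
    new_by_index = {record.index: record for record in new_scan}
    return {old.path: classify(old, new_by_index.get(old.index)) for old in old_scan}
```

On POSIX systems an identifier of this kind can be read with
os.stat(path).st_ino. Matching on the index first is what lets a renamed file be
recognized as the same file rather than as one deletion plus one new file.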
Someone says that they can run Fixity on a Windows machine which was
pointing at their Linux file server, speaking to the Linux component.
What happens when you're monitoring on a remote server, and you have connectivity
issues...
Well, connectivity issues will impact this in the same way that it would
impact anything trying to access data over a remote connection, so you would
be impacted in the same way you would if you were trying
to use those files, for instance.
Five people are asking about the size, they wanna know
specifics, if Fixity usually works with very large directories, 10 terabytes
or more. I saw a couple other questions come in,
can you tell me that it will for sure work for
a large file of 100 gigs
or I think someone else referenced
80 terabytes, 50000 files. What I can say is, there's no reason it
shouldn't. I haven't tested 80 terabytes, 50000 files necessarily, I think
the biggest test we've run it on is up to 20 terabytes and
probably a whole lot more than 50000 files. A hundred gigabytes definitely,
yes, we've tested that. I can tell you it'll do a file of
100 gigs for sure.
And for large directories, this person references 10 terabytes or more. Yes,
we... We do that...
Someone gives kudos to the slide design
and
I'm trying to see which questions we have not touched on
already... Can you use Fixity to try to identify duplicate files in different
locations?
Yes and no, but I would hesitate to say yes because
it would be somewhat awkward and cumbersome. So I would say it is
not a tool designed to do that. It's not a deduplicator.
But you could potentially run scans, and then
pull the data into Excel or some other spreadsheet application and sort
data in such a way that you could get an answer to that
question, but I would not... It's not what I would recommend that it
be used for
just because there's probably more efficient ways to do that.
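The spreadsheet approach amounts to grouping rows by checksum and looking for
any checksum that appears more than once. A minimal sketch of that idea,
assuming you have already pulled (checksum, path) pairs out of a scan by some
means; the function name and the row layout are hypothetical and say nothing
about Fixity's actual file format:

```python
from collections import defaultdict


def group_by_checksum(rows):
    """Return {checksum: [paths]} for every checksum shared by 2+ paths.

    rows is any iterable of (checksum, path) pairs (assumed layout).
    """
    groups = defaultdict(list)
    for checksum, path in rows:
        groups[checksum].append(path)
    # Keep only checksums that occur more than once: candidate duplicates.
    return {c: sorted(paths) for c, paths in groups.items() if len(paths) > 1}
```

A shared checksum only marks candidates; comparing the files themselves is
what confirms that they are really identical.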
Can it read a hash generated with other tools?
Well, this is where we've talked about integrating with Bagit bags that...
If we were gonna say ingest a bunch of bags and I'm using
that term ingest in a loose way here...
It could potentially grab the checksum from the manifest within a bag...
But we haven't done that currently today. And the answer to your question
today is, no, it does not, it generates, the tool, it generates its own
checksums. Now, if you had a tool that created
data... Data in the Fixity
format, you can import projects. So there's a potential there
but I don't know of anybody that's doing that today.
How long does it take to scan our 18 Terabyte NAS? I think
to really answer that question is, I'm not sure I don't...
I haven't timed it, we do it once a week.
And like I said, I break that 18 terabytes into
several projects maybe 12 projects. I'm just kinda guessing but somewhere
in that neighborhood, and
I'm not sure how long it takes a while, 'cause it's over a
network, it's over a dual CAT6 connection
and using a managed switch,
and we tend to run it at night, so there's not gonna
be competition for bandwidth through the switches. That's about as much
as I can tell you. Would Fixity flag
two identical files with different file names?
Well, it would identify those, so it wouldn't explicitly declare that there
are identical checksums, but you would see that there were multiple checksums,
multiple checksums don't necessarily mean identical files either.
So, but
so I would say again, you could get there, you could answer that
question, but it wouldn't be the most efficient tool for answering that
question. Another question about Fixity reading other hashes and stuff, so
I guess... There's some interest in that.
Would Fixity fail silently, if you have connectivity issues...
There, it's gonna depend on the operating system and how this manifests.
I will say... You don't currently get an explicit notification that it's
failed. So you wouldn't get an email saying that it failed,
and in Windows, say, there's the Windows scheduler,
you could check that to see if it ran and how successful it
was.
I yeah, that's been an area where actually we wanted to improve upon
it. So that's a good question. And raising a good kind of future
feature that we should incorporate. If there is a failure for it to
report in a more apparent way than it does currently, so it...
I'd say that it more or less does kind of fail silently right
now, you have to take a more proactive stance recognize that you didn't
get the email that you were supposed to get, or there's something or
actively go check the scheduler or something like that.
Currently.
If there's a connectivity or other hardware issue would the scan re run?
And is it gonna be different on different operating systems, there's a preference
you can check to say if the system is not on
at the time that was supposed to run when the system starts
it should run.
So that's gonna be, again, different between the macOS, and the Windows
OS schedulers.
So that's kind of a... Half yes,
now, the question is actually, if there's connectivity or other hardware
issues that's not exactly what I answered, mine is more if this system is
off, if it fails, it's not gonna know that it failed and re run,
it would just pick up at the next scheduled period. Alright,
questions are still coming in for those that are still on with us.
I'm gonna keep going through these,
we won't go on too much longer here.
So, I wanted to add, a couple people asked about
arguments for either MD5 or SHA 256. I'll say that there's plenty of
them. People... It's a public and a lot of time on...
We've offered those because both are of interest
to the community and used in the community.
We've also had people ask us about CSV
because the advantage,
one advantage to MD5 over SHA 256 is that it's faster,
it's a less computationally intensive algorithm, so it just takes less time
to do.
But there are other reasons people feel, I was gonna say it's gonna
depend on your environment and the degree to which
security issues, as in hacking, replacing, altering the files in an
expert way, are likely to happen and impact your collections.
So people that are more security conscious will go with SHA 256. MD5 is
more likely to have a collision, meaning that
you could have two files that are not identical that end up with
the same MD5. It is extremely, extremely, extremely unlikely.
And there have been studies done on this and...
And
that demonstrate that in fact it is highly unlikely
and that only really matters. We have all this additional metadata we have
file path we have
file names, we have a lot of other information here. So the question
we wouldn't even if there were multiple checksums, the same checksums for
different files,
it wouldn't really matter for the question set that we're posing with Fixity,
it wouldn't cause any confusion or anything along those lines. So those
are just... That's a little bit of discussion. I'm gonna move on.
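For anyone who wants to see the two algorithms side by side, here is a minimal
sketch using Python's standard hashlib module. It is only an illustration of
computing a checksum and is not how Fixity itself is implemented; the function
name file_checksum and the chunk size are assumptions made for the example.

```python
import hashlib


def file_checksum(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Return the hex digest of a file, read in chunks to limit memory use.

    algorithm is any name hashlib accepts, such as "md5" or "sha256".
    """
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        # Read fixed-size chunks until read() returns empty bytes at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Calling file_checksum(path, "md5") gives a 32-character hex string and
file_checksum(path, "sha256") gives a 64-character one. Changing even one byte
of the file, including embedded metadata, produces a different value from
either algorithm.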
Someone asks about this sub directory, it says when there are sub directories,
is the content of the sub directories scanned? And answer is yes.
Someone asks if they can run it,
run a large directory that has lots of sub directories and disregard
some of those sub directories. And I'd say currently no, there's no way
to do that, there is a way to filter out hidden files or
something like that, so I'd have to look in, I don't think we,
I can't remember what the logic is, if it's
file name ends with or if it's file name contains... But I think
that the answer to that question is that, No, you couldn't really do
that. You'd have to set up different projects for each of those sub
directories that you wanted to run.
Someone asks about
some mechanism for documenting when Fixity has changed
but purposefully which is a great, that's a great idea,
we haven't thought about that. And that hasn't come up but I like
that idea. Cool, I'm gonna go... add that to the list of things
to think about.
So, would you advise that if you were scanning files on a server
that you have the email sent to you whether or not there is
an issue, would it indicate the scan ran partially? Yeah, absolutely
that could indicate it. We just wanted to give people the options so
that if they wanted less emails in their inbox,
they could have them... But yeah, that would be a good way to
kind of keep an eye on things and know when something has not
gone right. Can a corporate SMTP server be selected for sending Fixity email
notifications?
Yes. I'd say that email is where a lot of problems come up
because of port blocking and things like that, that happens
in certain... We've given users some ability to configure ports and
stuff but
there can be issues and they're gonna be specific to the security of
a given organization and there's only so much control we have over that.
It's really gotta be a conversation with IT about getting
the application approved or a port unblocked, or something along those lines.
I think I have hit on everything that
we've gone through.
Alright, I'm gonna leave it there, Amy while I was rambling on,
did you think of anything else you wanna say before we sign off
for...
No, I think we've covered everything that...
Well, we covered in the demo.
Great,
alright, well thanks to everyone for joining and thanks to those who stuck
it out to the end... And we appreciate you joining us today,
and hope that if you're not a user of Fixity you'll become a user of Fixity.
We hope to see you in a user group or
some other forum. Thanks a lot, and thank you Amy. Thank you.
Take care.