|
Steven Karel
Administrator
November 4, 2001
09:50:10 AM
|
Jargon |
| Since this is the technical support website, and this is a university, a short primer on figuring out what someone is saying:
|
Anonymous Poster
November 16, 2001
08:32:54 AM
|
I have a better idea, Mr. Karel. Why don't you simply explain the term in your post when you use it, like the other Anonymous person above suggested?
|
Danny Silverman
November 18, 2001
02:34:09 AM
|
I have a better idea. Use your name. It elevates the discussion. From Anti-dmca.org: The DMCA is the Digital Millennium Copyright Act, passed by the U.S. Congress in 1998, supposedly to update copyright law for electronic commerce and electronic content providers. Unfortunately, this law is very poorly written, and is now regularly used by corporations to restrain the three primary concessions of copyright and otherwise prevent free speech activity.
|
Jesse Grittner
November 24, 2001
02:45:23 AM
|
Posting this thing might spike my u/l levels for the week =) |
| Look, this is ridiculous. The hostility and rudeness of some of the anonymous posters are getting out of hand, and I, for one, think they need to grow up. I just want to say a couple of things and then get out of the way of the flames...
1. People need to stop taking this stuff so personally. If you are in the top 10, even if you are (gasp) #1 on the list, no one is calling you a bad person. I'm sure that Rich and everyone else truly believes that you are a beautiful and unique snowflake. As someone who has made the list a few times myself, I just don't understand how anyone can be offended by the discussion taking place here. The numbers don't lie - if you are sending or receiving a lot of bits, that's it.
2. This discussion is *important*. It's about the regulation of a shared and limited resource. Those two factors combine to make the problem one of balancing. What Rich has asked for from the beginning is input on how best to perform that balancing. The same sort of discussion is taking place all over the country, as university administrators realize that bandwidth is expensive and will (usually) be filled as quickly as it can be purchased. For all of the failings Brandeis may have, I'm very glad that this school's approach has been one of conversation and debate, as opposed to administrative decisions imposed from on high. So let's try to be a little more mature about it, shall we?
3. If you want to take part in the debate, it is your job to familiarize yourself with its terms. I am relatively confident that none of the posters have used jargon in order to cloud the issue or to enhance their egos. Terms like DMCA, bandwidth, and bottleneck all have fairly standard definitions. Explaining them every time they're brought up is a waste of time, especially in a forum that is, by necessity, technically oriented. You're college students (maybe grad students, for all I know) - demonstrate some of that intellectual curiosity that got you admitted to the 'Deis in the first place and look something up.
4. Sorry this is so long. I'm a COSI major, but I'm also applying to law schools for next year. So you get a treatise that is not only crammed with technical jargon, but is also mind-numbingly long.
|
Travis Seifman
December 7, 2001
03:16:05 PM
|
Response to Posting this thing might spike my u/l levels for the week =) |
Why are people getting so annoyed about this? It's not because they're on the list, because we all agree with exactly what you said. All the list means is that some people are using more than others, and that's it. The problem arises when those people get attacked for it, either in the form of warning emails, or restricted access. No one should be punished for making use of a resource provided to us (and I do believe that we pay for the Ethernet access as part of our room and board anyway). THAT is what people are upset about, and THAT is something that should not be going on.
|
Jesse Grittner
December 7, 2001
04:51:22 PM
|
Response to Travis |
"The problem arises when those people get attacked for it, either in the form of warning emails, or restricted access. No one should be punished for making use of a resource provided us"
I think it's overly simplistic to say that people are being punished for "making use of a resource." The warning e-mails and access restrictions come about because, practically speaking, some people's usage negatively affects other people. When their bandwidth usage overwhelms others', I think that some sort of a line has been crossed. Brandeis provides us with a resource in the form of bandwidth, but that does not mean it's an unlimited resource. That's why it's fair for the school to try to ensure equal access to everyone.
|
Danny Silverman
December 9, 2001
03:35:05 AM
|
Moving Towards An (Unofficial) Solution |
| So when I came to Brandeis I ripped most of my CDs to MP3 and then left the disks behind. When the Great Hard Drive Crash of 2001 left me without my 5 gigs of MP3s, I had to start rebuilding my collection without the CDs. For the longest time I've been having trouble finding a few songs. So on Friday at about midnight I decided to do something about it.
Remembering Rich's discussion of Morpheus/KaZaA traffic and his fun neo.php script, I decided to start a catalog. This catalog has two purposes:
- To allow people to search for and access files that are shared by other campus users over Morpheus/KaZaA.
- To allow people to search for and access files that are available on the local Windows network.
Objective one is now complete. It works like so:
- My computer does a service scan of all the dorm subnets, looking for computers with open port 1214 (Morpheus/KaZaA).
- This list is fed into a spider that goes and fetches each page and parses the data into a MySQL database.
- The MySQL database is indexed for full text search.
- A web interface gives everyone a convenient access method.
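The first step in the list above (finding machines with port 1214 open) could look roughly like the following. This is a minimal Python sketch and not Danny's actual scanner; the function names, the timeout, and the injectable `probe` argument are assumptions made for illustration, and only the port number comes from the post.

```python
import socket
from ipaddress import ip_network

KAZAA_PORT = 1214  # the Morpheus/KaZaA port named in the post


def has_open_port(host, port=KAZAA_PORT, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable, or timed out: treat all as "not open".
        return False


def scan_subnet(cidr, port=KAZAA_PORT, probe=has_open_port):
    """Return the usable host addresses in cidr for which probe(host, port) is true.

    probe is a parameter (hypothetical design choice) so the scan logic can be
    exercised without touching the network.
    """
    return [str(ip) for ip in ip_network(cidr).hosts() if probe(str(ip), port)]
```

The resulting host list is what would be handed to the spider. Probing one address at a time is slow on a large subnet, so a real scanner would likely run the probes concurrently.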
On my last catalog the crawler picked up over 30,000 files. There are some limitations to the system due to the use of Morpheus/MySQL:
- No metadata searching. You get filenames only, not the ID3 tags in music files.
- No short words. MySQL 3.x requires you to have 4 characters or more in the search. Don't try looking for just "U2" because you won't find anything.
- No booleans. MySQL 4.x supports this, but I don't know if that release is stable enough for me to upgrade.
- Finally: no guarantees! If someone has turned off Morpheus (or their computer) since the time of the scan, their files will no longer be accessible.
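To see why a search for "U2" comes back empty, here is a toy Python model of an index that ignores short words. It is not MySQL's implementation; the tokenizer, the function names, and the sample filenames are assumptions, and the 4-character threshold is taken from the limitation described above.

```python
import re

MIN_WORD_LEN = 4  # assumed threshold, matching "4 characters or more" in the post


def index_terms(text, min_len=MIN_WORD_LEN):
    """Split text into lowercase alphanumeric words, dropping words shorter than min_len."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {w for w in words if len(w) >= min_len}


def search(filenames, query, min_len=MIN_WORD_LEN):
    """Return filenames whose indexed terms include every indexable word of the query."""
    wanted = index_terms(query, min_len)
    if not wanted:
        # Every query word was too short to be indexed, so nothing can match.
        return []
    return [name for name in filenames if wanted <= index_terms(name, min_len)]
```

With `min_len=4`, the query "U2" yields no indexable words at all, so the search returns nothing even when a matching file exists. Lowering the threshold to 2 makes the same file findable, at the cost of a larger index.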
With these restrictions, I am still confident that this system will be very useful in alleviating some of the bandwidth problems. Already I have been able to find several songs and download them at over 300KB/s. I hope other people will use this system responsibly. Recognize that I am still tweaking it and it might not always be working. I will also try to get SMB support in for searching the local network. This will most likely be handled by pre-built libraries like the system being used at RPI, so I'll have to figure out how to bodge together their system and my Morpheus search. That will probably come after break.
On the edge of your toes? Here is the system: Boogle
Happy hunting!
|
Sahil Tandon
December 9, 2001
04:11:49 AM
|
Response to Reducing network bandwidth use with file sharing programs |
| Danny, Despite the beta status, this is very impressive, and more importantly, cool! :) Keep up the good work and let me (and anyone else interested) know what we can do to possibly help you speed up the process to reach a fully functional inter-college search engine.
|
Anonymous Poster
December 9, 2001
10:20:09 AM
|
Danny- Your Boogle is great, but as I recall, there was a post on here advocating programs that search the LAN...I don't see that post anymore...But do a search on Google for Sharescan.
|
Steven Karel
Administrator
December 9, 2001
11:49:07 AM
|
Without addressing the question of whether this creates liabilities for anyone, Anonymous Poster (Dec 9) could just write a plug-in front-end for Danny's database to search the Windows fileshares. Use php to call smbclient to look for open file shares, and then have the spider part use smbclient as well.
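A rough sketch of that idea, in Python rather than PHP: call `smbclient` to list a host's shares and parse the result. It assumes an `smbclient` build that supports `-L` (list shares), `-N` (no password prompt), and `-g` (grepable output), and that the grepable output consists of `Type|Name|Comment` lines. The function names are hypothetical.

```python
import subprocess


def parse_share_list(output):
    """Extract disk share names from lines of the assumed form 'Type|Name|Comment'."""
    shares = []
    for line in output.splitlines():
        parts = line.split("|")
        if len(parts) >= 2 and parts[0] == "Disk":
            shares.append(parts[1])
    return shares


def list_shares(host, timeout=10.0):
    """Ask smbclient for the shares on host anonymously; return [] on any failure."""
    try:
        result = subprocess.run(
            # Argument list, not a shell string, so host is never shell-interpreted.
            ["smbclient", "-L", host, "-N", "-g"],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        # smbclient missing, not executable, or the host never answered.
        return []
    return parse_share_list(result.stdout)
```

The spider half would then walk each returned share in a similar way and feed the filenames into the same database.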
|
Anonymous Poster
December 9, 2001
12:22:31 PM
|
Re: Boogle
I wrote pretty much the exact same program about a month ago, but was persuaded not to run it by the network people because of the possibility of the university being held liable (see this thread). As far as the actual implementation goes, why are you using a database? That's serious overkill for this application. You're better off just dumping URLs into a file and then grepping it (after appropriate shell-sequence related paranoia, of course). That'll get rid of your short word and boolean problems, as well as save you memory and CPU time. I've also got a faster script to generate the file list, if you're interested.
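The flat-file approach can be sketched in a few lines of Python. This is one possible reading of "dump URLs into a file and grep it", not the poster's script; the one-URL-per-line file layout and the function names are assumptions. Matching in-process rather than shelling out to `grep` sidesteps the shell-escaping worry entirely.

```python
def matches(line, required=(), excluded=()):
    """True if line contains every required term and no excluded term (case-insensitive)."""
    lowered = line.lower()
    has_all = all(term.lower() in lowered for term in required)
    has_none = not any(term.lower() in lowered for term in excluded)
    return has_all and has_none


def search_file(path, required=(), excluded=()):
    """Scan a file holding one URL per line and return the matching lines."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return [line.rstrip("\n") for line in fh if matches(line, required, excluded)]
```

Short terms such as "U2" work here, and `required`/`excluded` give a basic AND/NOT. The trade-offs are that every query rescans the whole file, results are unranked, and substring matching is looser than word matching ("u2" also matches "menu2").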
|
Danny Silverman
December 9, 2001
01:38:51 PM
|
Wow, I must have missed that old thread. Very interesting. My first thought was to keep this private, but I am the kind of person who likes to make everything I do open (see agblog.com ;-) so as to make my purposes clear. I want this to be open to discussion. Since my crawler is basically doing the same thing that Morpheus does anyway, only on a local scale, it seems like it will be fine until an American court shuts down Morpheus. :-)
Faster scripts are good, although mine is pretty snappy right now. I wasn't looking for polished, just workable. And I was hesitant to do this until I saw the RPI site, which is so blatant, and seems not to be having any problems. Besides, what if you want to find a copy of Sheffries Phandemonium, where else will you turn? :-D
I used a DB because 1) I'm no good at Unix stuff but know PHP, 2) It seemed like with an expectation of around 100,000 files it would be appropriate and 3) I just had one running already. Does a nice bubble search or whatever, gives rankings, all very easily.
I'm totally not against improvements, and I figured that for Samba stuff I would end up integrating Phynd, since it's all there already, and, if I'm getting my programs correct, it is the one that is opt-in, where you type in your computer name and then select shares to catalog. Or is that the other one? But that way, it's voluntary. Whichever one is the voluntary one is the one I would use. Not that you aren't being voluntary in just leaving open shares on your machine...
Separate note: I use my server to run several web sites that I would like to keep up. If people at UNet find a problem with this service, please contact me instead of just dumping my network connection. Thanks. ;-)
|
Anonymous Poster
December 9, 2001
11:52:04 PM
|
awesome work danny, keep it up! implementing some part of phynd is a great idea, there should be a way where files (movies, etc.; and not only mp3's) can be searchable. i agree that using a database might not be the best solution.. but who knows ?
|
Anonymous Poster
December 11, 2001
08:33:08 PM
|
Think copyright laws aren't being actively enforced?
"Raids were carried out today at the University of California at Los Angeles, the Massachusetts Institute of Technology, Purdue University, Duke University and the University of Oregon, officials said."
See this New York Times article.
|
Yonatan E. Samlan
December 11, 2001
09:28:37 PM
|
raids on piracy |
| Those recent raids were raids on members of the hacking group DoD (Drink or Die), of actual crackers/releasers of hacked software who had terabytes and terabytes (THOUSANDS of gigabytes) of pirated software. Don't try to confuse the issue by crying "wolf". When they start to raid casual warezers or casual mp3 downloaders, 99.9% of computer-literate people worldwide (especially those with broadband) are screwed. And there couldn't be enough courts in the world to handle cases against 99% of the population ages 12-30. So proceed with the understanding that this stuff is illegal (whether rightfully so or not is another question), but that most users face little danger of legal action. Kinda like pot (hate to draw parallels): it's something a lot of college students do and feel should be legal, but as of now isn't. Casual users who don't act stupid are relatively safe as of now (that may change). Piracy raids are nothing new. It's not the first time a major piracy op has been raided, and it certainly won't be the last.
|
Danny Silverman
December 11, 2001
10:05:28 PM
|
Philip Bond, the Commerce Department's under secretary for technological policy, said cyber-pirates steal an estimated $12 billion worth of technology and goods a year, according to the Business Software Alliance. American leadership in computers and software is "very much at stake" because of piracy, he said.
And one must ask - if this software is being "stolen" at the university level, would the students have it otherwise? Businesses are one thing, but does anyone really expect a student to go out and spend $10,000 on Maya? This $12 billion figure is likely grossly inflated, especially coming, as it does, from the extremely biased BS Alliance. They're assuming full market price for every copy of Photoshop, assuming again that that many people would go out and buy the software to use if they could not get it free. Then assuming that everyone who does get it would upgrade every time an upgrade came out if they had to actually pay for the upgrade. And that every copy someone owns is a new license (not, say, an OEM thing). That's a lot of assumptions, most of them wrong. Remember, when you assume it makes an ass out of you and me.
And if Brandeis offered education discounts like most other schools out there, this might help a bit also with the piracy problem...if I could get Mac OS X 10.1 for FREE like EVERY OTHER COLLEGE STUDENT, I wouldn't have had to download the fricken 650MB image from a warez site. I mention this in a public forum because OS X 10.1 was a free update, so realistically I broke no laws, but it was annoying as hell.
And hey, if you actually had to pay for Photoshop, I know that I, for one, would just start using The GIMP.
|
Danny Silverman
December 13, 2001
03:17:44 AM
|
In Other News... |
| Boogle 2.0b1 is out for testing. Hit it and play with it. There are more options, about 300,000 more files catalogued, and a slicker interface. Note that Boogle is restricted to 129.64.*, so no trying to access it from off-campus.
Boogle 2.0 is brought to you by me, Danny Silverman, and by Peter Williams and Nat Budin, who basically did all the hard work on the SMB support. Also thanks to Tim Hickey for letting me work on this project for COSI. The system is running on my Mac OS X box. The spiders are running manually right now, since everyone is leaving anyway and it seemed dumb to have them crawling with no one around. I don't plan to update the file lists until after break. With that said, get 'em while they're fresh. And please try to stay as legal as possible. ;-)
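The on-campus restriction mentioned in this post (129.64.*) amounts to a check on the client's address. A minimal Python sketch follows, assuming that 129.64.* means the network 129.64.0.0/16. How Boogle itself enforces the rule is not stated, and the function name is hypothetical.

```python
from ipaddress import ip_address, ip_network

CAMPUS_NET = ip_network("129.64.0.0/16")  # the 129.64.* range from the post


def is_on_campus(remote_addr):
    """True if remote_addr parses as an IP address inside CAMPUS_NET."""
    try:
        return ip_address(remote_addr) in CAMPUS_NET
    except ValueError:
        # Not a valid IP address at all: treat as off-campus.
        return False
```

A web front end would call this with the connecting client's address and refuse the request when it returns `False`.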
|
Rich Graves
December 13, 2001
12:40:29 PM
|
http://boogle.agblog.com/ looks pretty cool, though you might consider trying to come up with a more original brand name. A quick [failed] search for system32, windows, etc. suggests that you've successfully excluded system directories.
Technical correction: Gnutella is not the same thing as Morpheus/Kazaa.
UI suggestion: Uncheck SMB if USER_AGENT is anything but MSIE for Windows.
EFF Counsel Lee Tien presented a sort of updated version of http://www.eff.org/IP/P2P/Napster/20010227_p2p_copyright_white_paper.html at USENIX. It's worth noting that the EFF *lost* important bits of both the 2600 and MusicCity cases within the last few weeks. Still, I think you have a reasonable defense on fair use grounds, since it's limited to the local campus. You don't qualify for the safe harbor for search engines because you lack a posted copyright policy, agent, etc. You can start on the road to that by linking to http://www.brandeis.edu/copyright.html. Copyright holders annoyed by things found by boogle should be instructed to send a standard notice (note that the law provides a boilerplate that should be followed for a complaint to be taken seriously; Stanford has a nice outline at http://www.stanford.edu/group/itss-ccs/security/dmca.html) including the IP address, filenames, and verification of copyright ownership to copyright@brandeis.edu.
"If another University (RPI) uses it and boldly publicizes it through a website" Um, RPI isn't doing anything. phynd is an individual student project, just like yours.
|
Danny Silverman
December 13, 2001
02:08:27 PM
|
We've excluded anything similar to C$, so unless someone specifically shared their Windows dir, I think we're okay. Removed gnutella reference (was trying to make it easier to understand, guess it just complicates things.) SMB suggestion implemented. http://www.brandeis.edu/copyright.html Linked to. Brandeis U does qualify as a safe harbor, right? As for RPI, can't find that comment on this page, but I think I was just saying that it's blatant, not that it's university-supported.
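The post does not say how "similar to C$" is decided. One plausible reading, sketched below in Python, is to skip every share whose name ends in `$`, which is how Windows marks hidden and administrative shares such as C$, ADMIN$, and IPC$. The function names are hypothetical.

```python
def is_hidden_share(name):
    """True for share names ending in '$' (Windows hidden/administrative shares)."""
    return name.endswith("$")


def catalogable(shares):
    """Return only the shares that should be crawled, preserving their order."""
    return [share for share in shares if not is_hidden_share(share)]
```

As the post notes, a filter like this does nothing for a Windows directory that someone has deliberately shared under an ordinary name.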
|
Steven Karel
Administrator
December 13, 2001
02:24:11 PM
|
Does Boogle obey robots.txt files? It really should. If so, where are you looking for them in SMB shares? At the top level in a file named ROBOTS.TXT?
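For reference, honoring robots.txt rules takes little code with Python's standard `urllib.robotparser`. The sketch below assumes the crawler has already read the rules text from wherever it looks (for an SMB share, a top-level ROBOTS.TXT is only the convention floated in this post, not an established one), and the agent name "Boogle" and function names are illustrative.

```python
from urllib.robotparser import RobotFileParser


def load_rules(robots_text):
    """Build a parser from the text of a robots.txt-style file."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def may_index(parser, path, agent="Boogle"):
    """True if the rules allow the given agent to fetch the given path."""
    return parser.can_fetch(agent, path)
```

A spider would call `may_index` for each path before cataloging it and skip anything that is disallowed.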
|